Abstract

We propose an algebraic combinatorial framework for the problem of completing
partially observed low-rank matrices. We show that the intrinsic properties of the
problem, including which entries can be reconstructed, and the degrees of freedom
in the reconstruction, do not depend on the values of the observed entries, but only
on their position. We associate combinatorial and algebraic objects, differentials and
matroids, which are descriptors of the particular reconstruction task, to the set of
observed entries, and apply them to obtain reconstruction bounds. We show how
similar techniques can be used to obtain reconstruction bounds on general Compressed
Sensing problems with algebraic compression constraints. Using the new
theory we develop several algorithms for low-rank matrix completion, which allow us
to determine which sets of entries can potentially be reconstructed and which cannot,
and how, and we present algorithms which apply algebraic combinatorial methods
in order to reconstruct the missing entries.

1. Introduction



Matrix Completion is the task of reconstructing a low-rank matrix from a subset of its entries. It
occurs naturally in many practically relevant problems, such as missing feature imputation,
multi-task learning [2], transductive learning [12], or collaborative filtering and link prediction [1, 25].

With the nuclear norm heuristic having been applied with increasing success in the
reconstruction of low-rank matrices [5, 40], it has become increasingly important to analyze the
potential and limitations of matrix completion methods.

Existing approaches can be classified by the assumptions about the sampling procedure and
the low-rank matrices whose entries are measured. Candes and Recht [5] analyzed the noiseless
setting, and have shown under uniform sampling that incoherent low-rank matrices can be
recovered with high probability. Salakhutdinov and Srebro [35] considered the more realistic
setting where the rows and columns are non-uniformly sampled. Negahban and Wainwright
[28] showed under the same row/column weighted sampling that non-spiky low-rank matrices
can be recovered with large probability. Foygel and Srebro [11] have shown under uniform
sampling that the max-norm heuristic [39] can achieve superior reconstruction guarantees under the
non-spikiness assumption on the underlying low-rank matrix.

All the above theoretical guarantees are built on some assumption on the sampling procedure, 
e.g., uniform sampling. In a practical setting, we always know which entries we can observe 
and which entries we cannot (the so-called mask). One may ask if we could obtain a stronger 
theoretical guarantee (of success or failure) conditioned on the mask we have. 

On the other hand, all the above theories are also based on some assumptions on the un- 
derlying low-rank matrix, which are usually uncheckable. Although it is widely known that we 
cannot recover arbitrary low-rank matrices (see, e.g., [5, Equation (1.1)]), one may ask if there
is a theory for matrix completion for almost all matrices, depending only on the mask. 

Following the expository paper of Kiraly and Tomioka [21], we view matrix completion as a
problem lying in the intersection of two mathematical realms, combinatorics and algebra. Here 
the combinatorial structure arises from the masking pattern, which can be viewed as a bipartite 
graph, and the algebraic structure arises from the low-rank-ness of the underlying matrix. It is 
probably fair to say that previous studies (with some exceptions we mention below) have not 
paid enough attention to these underlying mathematical structures. 

Looking into more details about the combinatorial/algebraic structures of the problem allows 
us to derive novel necessary and sufficient conditions for any matrix completion algorithm to 
succeed in recovering the underlying low-rank matrix from some of its entries. Figure 1 shows
how combinatorial properties of the mask, such as r-closable, 2r-regular, and r-connected relate 
to unique/finite (up to finite number of solutions) completability. Although these combinatorial 
properties are implied with high probability from the sampling models (e.g., uniform) depending 
on the expected number of observable entries, they had hardly been discussed in the literature. 
We first discuss these combinatorial properties for a fixed mask in detail, and then show when 
these conditions are satisfied with high probability under a certain sampling model of the mask. 

Another point that differentiates our work from previous studies is that we avoid making any
explicit assumption on the underlying low-rank matrix. Although this may sound like magic,
it is possible when we consider completability for generic low-rank matrices, i.e., for almost all
low-rank matrices, allowing exceptions of measure zero. This is illustrated in Figure 2. Since it is
clear that we cannot successfully recover arbitrary low-rank matrices, we also need to make exceptions.
However, the set of exceptional cases has measure zero. On the other hand, previous studies
used bounds on certain quantities (coherence/spikiness) that characterize the goodness of
the matrix, which however results in a set of exceptional cases with positive measure.

Figure 1: Combinatorial properties we discuss in this paper; "whp" means with high probability
for sufficiently high sampling density (or expected number of observed entries).

Exploiting the algebraic/combinatorial structures of the problem, we propose the notion of
partial completability. Precisely, our algebraic-combinatorial tools allow us to tell which entries
can be imputed and which cannot. Since an entry is (finitely) completable if and only if
that entry has some algebraic dependence on the observed entries, this notion has a connection
to matroid theory. Matroids capture the notions of independence, dependence, and span for subsets
of a finite set, which in this case is the set of entries of a low-rank matrix, making them the
"right" tool for formalizing degrees of freedom. We propose a scalable randomized algorithm to
identify which entries may be recovered from the given ones; it can be considered a
generalization of an algorithm proposed in [38], but we rigorously prove its correctness. Note
that even when an entry is not completable, the nuclear norm heuristic would give some result.
However, the result may not be reliable in that case.

Figure 2: Difference between the conditions used previously in literature (non-spiky/incoherent)
and the generic assumption we use in this paper.

Furthermore, we propose a polynomial time algorithm to check for the property we call 
r-closable and at the same time actually perform matrix completion for a mask with such a 
property. We show that this approach can be superior to the well studied nuclear norm heuristic 
in some cases. In addition, we discuss the limitation of this approach, and how it is related to 
the more general notion of circuits of a matroid, which are not however as easy to compute as 
the r-closure. 



1.1 Results 

As the general overview in the previous section indicates, Matrix Completion has, until now, been
analyzed predominantly in the context of convex optimization. Indeed, naively, one could think
that the findings of Candes and Tao [6], which optimally characterize the asymptotic bounds
for reconstructability of an incoherent true matrix, settle, once and for all, the problem of Matrix
Completion and all that can be known about it.

However, examining the literature more carefully, part of the theoretical and practical
findings have already shown that the structural and computational phenomena in Matrix
Completion are far from being understood. On the theoretical side, for example, [38] have tried to
analyze the identifiability of Matrix Completion from a combinatorial point of view. While their
work, which relates Matrix Completion to the Framework Realizability Problem for rigid
bar-joint frameworks, remains mainly conjectural, they are able to give conjectural statements and
algorithms on the completability of partially known matrices which do not rely on the convex
optimization setting but only on combinatorial features which were also observed in different
contexts, see [30]. On the other hand, the practical findings in the existing literature are also far
from being complete. While the existing results give rise to algorithms with good asymptotic
guarantees, they often fail for the case of small matrices or small samples.

In this paper, we will explore both of these white spots on the map by taking into account the
intrinsic structure of the Matrix Completion problem which has, until now, only been addressed
to a marginal extent. It will turn out that Matrix Completion does not only have deep relations to
Functional Analysis, as has already been observed, e.g., by [3, 6], but also to Combinatorial
Commutative Algebra, Algebraic Geometry, Matroid Theory, Deterministic and Probabilistic
Graph Theory, and Percolation Theory. By combining these contributions into a closed whole, we
obtain what we believe to be the right tools to investigate theoretical and practical aspects of
Matrix Completion and a more general class of Compressed Sensing problems which exhibit
Combinatorial-Algebraic structure.

Here is a summary of our main contributions in this paper, which can also serve as a guide 
for reading: 



• In sections 2.1 and 2.2 we express the problem of Matrix Completion in terms of noisy
parametric estimation and, for the first time, explicitly separate the generative model for
the true matrix and the measurement process. This central part allows us to treat the
properties of the measurement separately from the properties of the matrix; in fact, the genericity
formalism introduced in section 2.2 will allow us to remove the influence of the true matrix
almost completely in identifiability considerations.

• In section 2.3, under the assumption of generic sampling introduced in section 2.2, we
apply, following some ideas from [21], some new, elementary techniques from Algebraic
Geometry and Combinatorial Commutative Algebra which allow us to parameterize
and characterize the measurement process by a bipartite graph. We show that all
properties of the measurement, for example degrees of freedom as well as identifiability, are
completely encoded by this graph and its algebraic properties, and are thus accessible to
algebraic-combinatorial methods. As a practical counterpart, this implies that whether the true
matrix can be reconstructed does, generically, not depend on the values at the known entries
but only on their positions.

• In section 2.4 we introduce a problem which is analogous to Matrix Completion. Instead of
asking for a reconstruction of all missing elements of a matrix, we ask for a reconstruction
of some. In particular, we ask for which missing entries a reconstruction is in principle
possible. This task, which we term Partial Matrix Completion, has apparently not appeared
in the literature yet, but is amenable to the techniques developed in the previous sections,
and is, in our opinion, of high practical relevance since in general, not all entries are to be
reconstructed. Our results include the fact that the set of entries which can be reconstructed
from the measurements also depends only on the positions of the known entries, not their
values.



• In section 2.5 we introduce, for the first time, combinatorial algebraic tools which allow us
to practically characterize the degrees of freedom which are destroyed by the measurement
process in terms of the graph defined in section 2.3. Matroid theory allows us to further
characterize the patterns which guide possible ways of reconstruction, and the theory of
differentials gives a grasp on their calculation.



• The theory developed in section 2.5 gives rise to several randomized algorithms, later
presented in section 3.1, which are able to efficiently determine which missing entries can
in principle be reconstructed, and which cannot. A special case is the conjectural algorithm
proposed by [38], but the algorithms also cover more general applications, including the Partial Matrix
Completion problem introduced in section 2.4. In particular, we present an algorithm
which answers the question of which entries can in principle be reconstructed, given the
positions of the known entries.



• The analysis of a special reconstruction pattern discussed in section 2.5 motivates a novel
algorithm which can perform reconstruction on the bipartite graph counterpart, which we
describe in section 3.2. By adding algebraic calculations onto the purely graph theoretical
foundations, we obtain in section 3.3 a first-of-its-kind algorithm which performs Matrix
Completion not via optimization, but via Combinatorial-Algebraic calculations.

• Since a graph parameterizes the measurement process, random measurements correspond
to random graphs. In section 2.6 we formalize this correspondence and, as an application,
obtain bounds for the number of measurements which is necessary for reconstruction. Also,
we provide theoretical and conjectural evidence for phase transition phenomena which can
be and have been observed in practical Matrix Completion settings, as well as explanations
for why and how classical Matrix Completion algorithms fail on certain classes of
measurements.



• In section 2.1 it has been shown that the conditioning on the true mask can be removed
in the analysis of identifiability. In section 2.7 we show that this is a general principle: we
develop a theory for Compressed Sensing under algebraic compression constraints (i.e., the 
constraints can be expressed by polynomial equations) and prove an upper probabilistic 
bound on the sufficient number of samples which are needed for reconstruction, which 
only depends on properties of the constraints, and not on the properties of the true signal. 
For the special case of Matrix Completion, we obtain sufficient bounds which are similar to 
those of [3] and [6], but now with the conditioning on the incoherence of the true matrix 
completely removed. 

• In the experiment section we underline our theoretical findings, conjectures, and practical
claims with evidence from simulations. In section 4.1 we show how the number of
reconstructible entries behaves with increasing sample size, and in section 4.2 we compare
the various theoretically predicted and practically observed phase transitions in the
Matrix Completion problem, amongst those the identifiability phase transition. Moreover,
we compare the performance of the various known and novel algorithms to these phase 
transitions. 

• Appendix A contains a technical treatise on sheaves of matroids on schemes (e.g., algebraic
varieties). It summarizes the genericity properties of a matroid of sections, when evaluated 
at different points of the scheme. While the results presented there are folklore and maybe 
not surprising, we decided to have them included since they seem not to be written up in 
the existing literature. 

Summarizing, this paper contains many results of theoretical and practical type, which do not
necessarily need to be read in sequence to be understood. Sections 2.1 and 2.2 are fundamental
and suggested reading, while sections 2.3, 2.6 or 2.7 can serve as independent starting points.
Also, the algorithms should be accessible (though not completely understandable) without
having read the theory part.

2. Theory of Low-Rank Matrix Completion

In this section, we derive how Low-Rank Matrix Completion is properly formulated as a
parametric estimation task. Then, different approaches to sampling are discussed, arguing that the
generic algebraic framework is the most proper way of approaching the problem. Subsequently,
novel Algebraic Combinatorial tools will be derived for exploiting the inherent structure of
Low-Rank Matrix Completion, which allow us to construct methods to solve and understand the features
of Matrix Completion from the structural, algebraic point of view.



2.1 What is Low-Rank Matrix Completion? 

Matrix Completion is the task of imputing missing entries in a matrix, given other, known, but 
possibly noisy entries. Low-Rank Matrix Completion is doing that under assumption of a low- 
rank model, that is, informally: 

Problem 2.1.1. Let A be a matrix, in which some entries are missing, and some entries are known. 
Given some target rank r, find a matrix A' of rank r, close to A.



From both a mathematical and procedural point of view, Problem 2.1.1 is ill-defined. The
standard way of parameterizing and well-posing the Matrix Completion model is assuming a
generative truth, i.e., that there exists a true matrix A, of which one observes some entries, plus
observation noise. Thus, under the generative assumption, Problem 2.1.1 reformulates to

Problem 2.1.2. Let $A \in \mathbb{C}^{m \times n}$ be an unknown matrix, of known rank r. Let
$\varepsilon \in (\mathbb{C} \cup \{\infty\})^{m \times n}$ be a noise matrix. Given the observed
matrix $A + \varepsilon$, reconstruct A.

In this description of Low-Rank Matrix Completion, the model of the truth is well-defined,
but without assumptions on A and the noise $\varepsilon$, it is practically useless. Thus, as is common
practice in statistics and learning theory, in order to obtain a proper, well-defined and practically
applicable model, one needs to

• (i) separate the generative model from the noise model. That is, separate the question of which
entries are observed from the accuracy with which they are observed, if they are observed.

• (ii) specify the generative sampling model. That is, introduce and specify random variables
for sampling A, and the set of observed entries.

• (iii) specify the observational noise model. That is, introduce and specify random variables
for the noise $\varepsilon$.

Also, measures of closeness from the observations to the putative truth need to be defined when
attempting to reconstruct and give reconstruction guarantees; however these are explicitly not
part of the model, but also need the above three points to be fulfilled to allow for proper
evaluation.



2.1.1 Separating Generative and Noise Models 

In order to perform (i) the separation of generative and noise models in a sound way, we need 
to introduce mathematical notation to parameterize the generative sampling process. First, we 
introduce notation for the set of all low rank matrices, from which the truth will be sampled: 

Notations 2.1.3. The set of all complex (m × n)-matrices of rank r or less will be denoted by

$$\mathcal{M}(m \times n, r) = \{A \in \mathbb{C}^{m \times n} \;;\; \operatorname{rk} A \le r\}.$$

We will always assume that $m \le n$; by transposing the matrices, this is no loss of generality.

It will become important later that the set $\mathcal{M}(m \times n, r)$ is the solution set of a set of polynomial
equations in the matrix entries - the minor equations - making $\mathcal{M}(m \times n, r)$ an algebraic
variety. $\mathcal{M}(m \times n, r)$ is often called the determinantal variety¹, while subsets of it may be called a
determinantal variety.

Next, we need to introduce notation for the process of specifying the observed entries of the 
matrix: 

Definition 2.1.4. A map $\Omega : \mathbb{C}^{m \times n} \to \mathbb{C}^{\alpha}$ which sends a matrix to a fixed tuple of its entries, i.e.

$$\Omega : (a_{ij})_{1 \le i \le m,\, 1 \le j \le n} \mapsto (a_{i_1 j_1}, a_{i_2 j_2}, \dots, a_{i_\alpha j_\alpha}),$$

is called masking in rank r. Such a map is uniquely defined by the set of entries $i_k j_k$ in the image
set. When clear from the context, we omit the qualifier "in rank r".

We call the unique matrix which has ones at those entries, and zeroes elsewhere, the mask of $\Omega$
and denote it by $M(\Omega)$. Similarly, we will call a matrix M having only ones and zeroes a mask, and
the map $\Omega$ such that $M(\Omega) = M$ the masking associated to M. When no confusion is possible, we
will denote it by $\Omega_M$ and implicitly assume that the rank r is fixed.



Note that Definition 2.1.4 allows for an entry to be observed several times; that may be
useful if the observation is noisy. However, in the rest of the paper, we will not explicitly make use
of this fact, so we will assume that no entry is observed twice, i.e., the bituples $(i_k, j_k)$ are all
different.

Naturally, we will be interested in the behavior of a masking $\Omega$ when its range is restricted to
the low-rank matrices $\mathcal{M}(m \times n, r)$. Before proceeding to reformulating the Matrix Completion
model, we give examples for the definitions above:

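Example 2.1.5. For any $m, n \in \mathbb{N}$, one has $\mathcal{M}(m \times n, m) = \mathbb{C}^{m \times n}$.

The simplest non-trivial examples of determinantal varieties are the square co-rank one matrix
varieties:

$$\mathcal{M}(n \times n, n - 1) = \{A \in \mathbb{C}^{n \times n} \;;\; \det A = 0\}.$$

For example, the co-rank one (2 × 2)-matrices are

$$\mathcal{M}(2 \times 2, 1) = \left\{ \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix} \;;\; a_{11} a_{22} - a_{12} a_{21} = 0 \right\}.$$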
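The minor-equation description above can be checked mechanically on small examples. The following is a minimal sketch, not part of the original text: the helper names `det` and `in_determinantal_variety` are hypothetical, and exact rational arithmetic is used in place of complex numbers for simplicity. It tests membership in $\mathcal{M}(m \times n, r)$ by verifying that all $(r+1) \times (r+1)$ minors vanish.

```python
from fractions import Fraction
from itertools import combinations


def det(square):
    """Exact determinant by Laplace expansion along the first row."""
    if len(square) == 1:
        return square[0][0]
    total = Fraction(0)
    for j, entry in enumerate(square[0]):
        minor = [row[:j] + row[j + 1:] for row in square[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def in_determinantal_variety(matrix, r):
    """Return True if every (r+1) x (r+1) minor vanishes, i.e. rank <= r."""
    m, n = len(matrix), len(matrix[0])
    size = r + 1
    if size > min(m, n):
        return True  # no minors of that size exist, so the bound is automatic
    a = [[Fraction(x) for x in row] for row in matrix]
    for rows in combinations(range(m), size):
        for cols in combinations(range(n), size):
            sub = [[a[i][j] for j in cols] for i in rows]
            if det(sub) != 0:
                return False
    return True
```

For instance, `in_determinantal_variety([[1, 2], [2, 4]], 1)` holds because the single 2 × 2 minor is zero, while a matrix with non-zero determinant fails the test for r = 1. Laplace expansion is exponential in the minor size, so this is only meant for toy cases.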



Example 2.1.6. Consider a true (3 × 3)-matrix

$$A = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}$$

which has rank one. By observing five entries (exactly, i.e., without noise), one may arrive at one of
the following two partial matrices:

$$A_1 = \begin{pmatrix} 1 & 2 & 3 \\ & 4 & \\ 4 & & \end{pmatrix} \quad \text{or} \quad A_2 = \begin{pmatrix} 1 & & 3 \\ & 4 & \\ 4 & & 12 \end{pmatrix}.$$

The masks corresponding to the two matrices are

$$M_1 = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \quad \text{and} \quad M_2 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix},$$

so the corresponding maskings are the maps

$$\Omega_1 : \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \mapsto \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ & a_{22} & \\ a_{13} & & \end{pmatrix}$$

and

$$\Omega_2 : \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \mapsto \begin{pmatrix} a_{11} & & a_{31} \\ & a_{22} & \\ a_{13} & & a_{33} \end{pmatrix}.$$

In particular, one has $M_i = M(\Omega_i)$, and $A_i = \Omega_i(A)$ for $i = 1, 2$. Also note that $\Omega_i$ could have
been expressed by the map which sends A to the Hadamard product, i.e., the componentwise product,
$A \circ M_i$.

1 To be more precise, the usual determinantal variety is the projective closure of the affine variety $\mathcal{M}(m \times n, r)$
we define. However, the generic behavior under algebraic maps, including fiber and image dimensions, does not
change fundamentally when restricting the projective morphisms to the images and pre-images of the affine variety
$\mathcal{M}(m \times n, r)$ which is dense in its projective closure.
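The two views of a masking in Example 2.1.6, as a tuple of observed entries and as a Hadamard product with the mask, can be written out directly. The sketch below is illustrative only: the function names `masking` and `hadamard` are hypothetical, matrices are stored as lists of rows in the order in which they are displayed above, and the rank-one matrix A is the one determined by the partial matrix $A_1$.

```python
def masking(matrix, mask):
    """Return the observed entries as a tuple, read row by row."""
    return tuple(
        matrix[i][j]
        for i in range(len(mask))
        for j in range(len(mask[0]))
        if mask[i][j] == 1
    )


def hadamard(matrix, mask):
    """Componentwise product of the matrix with its 0/1 mask."""
    return [
        [matrix[i][j] * mask[i][j] for j in range(len(mask[0]))]
        for i in range(len(mask))
    ]


# Rank-one matrix consistent with the observed entries in Example 2.1.6.
A = [[1, 2, 3],
     [2, 4, 6],
     [4, 8, 12]]

M1 = [[1, 1, 1],
      [0, 1, 0],
      [1, 0, 0]]

M2 = [[1, 0, 1],
      [0, 1, 0],
      [1, 0, 1]]
```

Note that the Hadamard form writes a zero at unobserved positions, so it only carries the same information as the tuple form when the mask is kept alongside it.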

Thus, the generative sampling process is modelled by applying some masking $\Omega$ to the true
matrix which is in the range, while the noise acts on the image of $\Omega$, formally separating both by
the map given by $\Omega$.



2.1.2 The Parametric Model of Low-Rank Matrix Completion 



Using the notations and definitions introduced in section 2.1.1, we can now provide a complete
model description for Low-Rank Matrix Completion:

Problem 2.1.7. Let r be the true rank, let A be an $\mathcal{M}(m \times n, r)$-valued random variable, modelling
the sampling of the true matrix. Let M be an (m × n)-mask-valued random variable modelling the
position of the observed entries, and $\alpha = \|M\|_1$ the integer-valued random variable which is the
number of observed entries. Let $\varepsilon$ be a $\mathbb{C}^{\alpha}$-valued random variable, modelling the noise.
Then, construct a good estimator for A which takes $\Omega_M(A) + \varepsilon$ as input.
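The three ingredients of Problem 2.1.7 (true matrix, mask, noise) can be simulated separately, which mirrors the separation of generative and noise models. The following is a minimal sketch under assumptions that are not in the text: real Gaussian factors stand in for a distribution on $\mathcal{M}(m \times n, r)$, the mask is sampled entrywise with probability p, the noise is i.i.d. Gaussian with standard deviation `sigma`, and all function names are hypothetical.

```python
import random


def sample_low_rank(m, n, r, rng):
    """Sample A = U V^T with Gaussian factors, so that rank(A) <= r."""
    u = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(m)]
    v = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(n)]
    return [
        [sum(u[i][k] * v[j][k] for k in range(r)) for j in range(n)]
        for i in range(m)
    ]


def sample_mask(m, n, p, rng):
    """Sample a 0/1 mask with independent Bernoulli(p) entries."""
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(m)]


def observe(matrix, mask, sigma, rng):
    """Return the observed positions and the noisy values Omega_M(A) + eps."""
    positions = [
        (i, j)
        for i, row in enumerate(mask)
        for j, bit in enumerate(row)
        if bit == 1
    ]
    values = [matrix[i][j] + rng.gauss(0.0, sigma) for i, j in positions]
    return positions, values
```

An estimator in the sense of Problem 2.1.7 would take `positions` and `values` as input; the number of observed entries is `len(values)`, which equals the sum of the mask entries.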

What is chosen for the particular sampling distributions of A and M differs throughout the
literature, as does the noise model $\varepsilon$. While choices for sampling A and M will be thoroughly
discussed in the next section, we will not put much emphasis on the noise model $\varepsilon$ yet, while
acknowledging that it is extremely important for practical purposes. However, identifiability of
the generative model is independent of the noise $\varepsilon$ as long as it is well-behaved; in fact one direction
of this claim is straightforward to see, and we summarize it in the following important remark:

Remark 2.1.8. If it is impossible to identify A from $\Omega(A)$, then there can exist no consistent estimator
for A which makes no use of hidden knowledge.

Even more can be seen to be true, as the following result shows: 

Theorem 2.1.9. There exists a consistent deterministic estimator for the true matrix A if and only
if A is identifiable from $\Omega(A)$.



Proof. Remark 2.1.8 gives one direction, so what remains is to prove that there exists an algorithm
to estimate A from $\Omega(A) + \varepsilon$ in a consistent manner, assuming A is identifiable from $\Omega(A)$.

We will in fact give an explicit, algorithmic estimator for A and prove that it is consistent if
A is identifiable from $\Omega(A)$. Let $B = \Omega(A) + \frac{\varepsilon}{\sqrt{N}}$ be the measured matrix, let $\mathcal{M}(m \times n, r)$ be the
set of m × n matrices of rank at most r. Then, we will consider the estimator $A_N = A_N(B)$ for
A which outputs the pre-image $\Omega^{-1}(P)$ of a point P on $\Omega(\mathcal{M}(m \times n, r))$ such that P is closest to
B with respect to the Euclidean distance on $\mathbb{C}^{m \times n} \cong \mathbb{R}^{2mn}$. If there is more than one point on
$\Omega(\mathcal{M}(m \times n, r))$ having the same distance to B, then $A_N$ will output the pre-image of the one which
is smallest with respect to the lexicographic ordering on $\mathbb{C}^{mn}$ induced by the isomorphism with
$\mathbb{R}^{2mn}$. By construction, $A_N$ takes only values in $\mathcal{M}(m \times n, r)$. We have to show that (i) $A_N$ can be
calculated by a deterministic algorithm, and (ii) $A_N$ is consistent in N.

(i) First we prove that $A_N$ can be calculated by a terminating, deterministic algorithm. Recall
that the determinantal variety $\mathcal{M}(m \times n, r)$ is classically known to be an algebraic variety, and so
is its image $X = \Omega(\mathcal{M}(m \times n, r))$. This means there are polynomials $f_1, \dots, f_k$ such that a point
$x \in \mathbb{C}^{m \times n}$ is in X if and only if $f_i(x) = 0$ for all $1 \le i \le k$. For example, one can start with the
minor equations of the matrix A, considering the matrix entries as indeterminates, and eliminate
all entries which are unknown.

First we estimate the noiseless $\Omega(A)$ by a set estimator $S(N)$ which can be written as a
deterministic minimizer of a loss function L, i.e.,

$$S(N) = \operatorname*{argmin}_{C \in \Omega(\mathcal{M}(m \times n, r))} L(C) \quad \text{where} \quad L(C) = \|C - B\|_2^2$$

(where in case of ambiguity we take the set of all C, which is compact). Notice that the functional
L is a non-constant and positive polynomial, so a minimum of it can be computed as follows:
take the derivative

$$\ell(C) = \frac{\partial L(C)}{\partial C}$$

and calculate all solutions of $f_1(C) = f_2(C) = \dots = f_k(C) = \ell(C) = 0$, e.g. with existing
(deterministic and terminating) symbolic-numeric methods which can provide certified solutions
of arbitrary accuracy. Differently formulated, this amounts to calculating the set

$$S(N) = V(f_1, \dots, f_k, \ell) = \{C \in \mathbb{C}^{m \times n} \;;\; f_1(C) = f_2(C) = \dots = f_k(C) = \ell(C) = 0\}.$$

Note that by construction, the set $V(f_1, \dots, f_k, \ell)$ is always non-empty. If it is finite, the smallest
point can be determined by a simple comparison; if it is infinite, then $S(N)$ has to be a sphere,
and the smallest point can be determined symbolically. A point $A_N$ in the pre-image $\Omega^{-1}(S(N))$
can also be determined symbolically-numerically.

2 Here, consistent is defined by the variance convention: for the observed matrix $B = \Omega(A) + \frac{\varepsilon}{\sqrt{N}}$, some scaling
factor N, and centered noise $\varepsilon$ with finite variance, the estimator $A(B)$ converges in probability to A for $N \to \infty$. This
is equivalent to observing each noisy entry of $\Omega(A)$ with multiplicity N and taking the number of samples convention
for consistency in the number of observations.

(ii) Now we prove consistency of $A_N$ in N. Since A is identifiable from $\Omega(A)$, $\Omega$
is injective at A. Since $\Omega$ is a complex algebraic map, $\Omega$ is also injective in a (complex) open
ball $U = B(A, \delta)$ around A. Since $\varepsilon$ has finite variance, there exist $c > 0$ and $M > 0$
such that $S(N)$ will be contained in $B(B, \delta / N^c)$ for all $N > M$ with high probability. Thus, any
series of points $S_N \in S(N)$ will converge in probability to B for $N \to \infty$. Since $\Omega$ was injective on
U, this implies that any series of pre-images $\Omega^{-1}(S_N)$ will converge in probability to A.
In particular, $A_N \to A$ converges in probability. □
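As a small worked illustration of the elimination step in part (i), and not a statement taken from the text, consider rank one (2 × 2)-matrices as in Example 2.1.5. The symbols follow the notation above; the choice of which entry is unobserved is an assumption made for the example.

```latex
% The variety is cut out by a single minor equation:
\mathcal{M}(2 \times 2, 1) = V(f), \qquad f = a_{11} a_{22} - a_{12} a_{21}.

% Case 1: all four entries are observed. Nothing is eliminated and
X = \Omega(\mathcal{M}(2 \times 2, 1)) = V(a_{11} a_{22} - a_{12} a_{21}).

% Case 2: a_{22} is unobserved. Every non-zero multiple of f has positive
% degree in a_{22}, so eliminating a_{22} leaves no equation:
(f) \cap \mathbb{C}[a_{11}, a_{12}, a_{21}] = (0),
% i.e. the closure of X is all of \mathbb{C}^3, and for a_{11} \neq 0 the
% missing entry is recovered from the minor equation as
a_{22} = \frac{a_{12} a_{21}}{a_{11}}.
```

In the second case the observed entries satisfy no polynomial relation, while the unobserved entry is still determined whenever $a_{11} \neq 0$; the exceptional locus $a_{11} = 0$ is a proper algebraic subset.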



In particular, Theorem 2.1.9 shows: if A cannot be identified from $\Omega(A)$, then no algorithm
without hidden knowledge can reconstruct all the missing entries in $\Omega(A)$.

This stringently motivates the analysis of properties of $\Omega$ alone, since the statement is
independent of the noise $\varepsilon$ under the condition of well-behavedness of the latter.
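A concrete instance of non-identifiability can be built from the mask $M_2$ of Example 2.1.6. The sketch below is an illustration under stated assumptions rather than a construction from the text: the one-parameter family `rank_one_completion(s)` and the helper names are hypothetical, matrices are stored row by row as displayed, and exact rational arithmetic is used. Every member of the family has rank one and agrees with $A_2$ on the observed positions, yet the members differ on the unobserved ones.

```python
from fractions import Fraction

M2 = [[1, 0, 1],
      [0, 1, 0],
      [1, 0, 1]]


def rank_one_completion(s):
    """Outer product x y^T with x = (1, s, 4) and y = (1, 4/s, 3), s != 0."""
    s = Fraction(s)
    x = [Fraction(1), s, Fraction(4)]
    y = [Fraction(1), Fraction(4) / s, Fraction(3)]
    return [[xi * yj for yj in y] for xi in x]


def observed(matrix, mask):
    """Observed entries under the mask, read row by row."""
    return tuple(
        matrix[i][j]
        for i in range(len(mask))
        for j in range(len(mask[0]))
        if mask[i][j] == 1
    )


def is_rank_one(matrix):
    """True if the matrix is non-zero and all 2 x 2 minors vanish."""
    rows, cols = len(matrix), len(matrix[0])
    nonzero = any(x != 0 for row in matrix for x in row)
    minors_vanish = all(
        matrix[i][j] * matrix[k][l] - matrix[i][l] * matrix[k][j] == 0
        for i in range(rows) for k in range(i + 1, rows)
        for j in range(cols) for l in range(j + 1, cols)
    )
    return nonzero and minors_vanish
```

For example, `rank_one_completion(2)` and `rank_one_completion(1)` produce the same observed tuple under `M2` but are different matrices, so in this toy case the observations alone cannot single out the true matrix.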



The estimator given in the proof of Theorem 2.1.9 is in general very inefficient and is
proposed only for the purpose of completing the statement which relates the exact algebraic
morphism to identifiability of a statistical problem. In fact, we do not expect that there is a much
better algorithm that applies to any true matrix. In his PhD thesis, Michael Dobbins [9] proved,
via a reduction to the Polytope Realization Problem [32, Parts I-II], that:

Proposition 2.1.10. Deciding if a partial matrix has a low-rank completion is as hard as deciding 
if any set of polynomial equations has a solution. 

This problem is known to be in PSPACE, and is at least NP-hard. (This particular formulation of
hardness, as well as a discussion from the complexity-theoretic perspective, is in [37].) Thus, to
obtain efficient algorithms, we will need to make some kind of assumption on the input. Ours will
be that it is generic, a concept that we describe next.



2.2 Genericity 

In this section, we discuss sampling assumptions for the generative model of the mask M defining
the masking $\Omega$, and the true matrix A. We introduce a new, algebraic genericity assumption for
the true matrix, which will later allow us to remove the influence of the sampling of the true matrix
A on the behavior of $\Omega$. Since in applications the mask is known, while the sampling assumptions
on the true matrix A and the true matrix A itself are in general unknown, we will argue that this
is at the same time the most natural and weakest possible assumption on the sampling of the true
matrix A.



2.2.1 Sampling of the Mask 

Several ways of sampling the mask have been considered in the literature. Table 1 gives a short
list of sampling distributions for the mask M. There are also different sampling assumptions
like the CLP or the Power Law model, which will however not be discussed in this paper. In the
Erdős-Rényi model, the number of edges $\alpha = \|M\|_1$ is binomially distributed as $\alpha \sim B(p, mn)$,
with expected value $mn \cdot p$ and variance $mn \cdot p(1-p)$. Thus, the relative variance, i.e., the variance
of $\frac{\alpha}{mn}$, is $\frac{p(1-p)}{mn}$, which approaches zero as $mn \to \infty$, so the qualitative behavior of $G(m \times n, p)$
will approach that of $U(m \times n, mn \cdot p)$ in the limit.

Sampling        Notation         Description
fixed mask      M                the mask M is fixed
uniform         U(m × n, k)      the number of observed entries $\alpha = \|M\|_1$ is fixed,
                                 otherwise the sampling is uniform
Erdős-Rényi     G(m × n, p)      each entry of M is independently
                                 Bernoulli distributed with parameter p

Table 1: Possible sampling assumptions for the mask

3 There exist C > 0 and a > 0 such that the probability is at least 1 −
4 We thank Günter Ziegler for reminding us about Dobbins's results.
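The two random mask models in Table 1 and the relative variance computed above can be sketched in a few lines. This is an illustrative sketch, not code from the paper: the function names are hypothetical and Python's standard `random` module is assumed as the source of randomness.

```python
import random


def sample_uniform_mask(m, n, k, rng):
    """U(m x n, k): exactly k observed entries, chosen uniformly at random."""
    chosen = set(rng.sample(range(m * n), k))
    return [[1 if i * n + j in chosen else 0 for j in range(n)] for i in range(m)]


def sample_erdos_renyi_mask(m, n, p, rng):
    """G(m x n, p): each entry is observed independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(m)]


def relative_variance(m, n, p):
    """Variance of alpha / (mn) when alpha is binomial with mn trials."""
    return p * (1 - p) / (m * n)


def observed_fraction(mask):
    """Fraction of observed entries, alpha / (mn)."""
    return sum(map(sum, mask)) / (len(mask) * len(mask[0]))
```

For a 50 × 50 mask with p = 0.3, `relative_variance` gives 0.21 / 2500, i.e. a standard deviation below 0.01 for the observed fraction, which is the sense in which G(m × n, p) concentrates around U(m × n, mn · p) as mn grows.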

Note that in practical applications, it is always possible to identify the result of the mask 
sampling, since it is - tautologically - always known which entries of the true matrix are known, 
and which entries are unknown. 
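
The concentration of the Erdős-Rényi model around the uniform model can be checked exactly for small sizes. The following is a minimal sketch and not part of the original development; the function name `relative_variance` is hypothetical. It computes the variance of $a/(mn)$ directly from the binomial probability mass function in exact rational arithmetic and compares it with the closed form $p(1-p)/(mn)$.

```python
from fractions import Fraction
from math import comb


def relative_variance(m: int, n: int, p: Fraction) -> Fraction:
    """Exact variance of a/(mn), where a ~ Binomial(mn, p) counts the observed entries."""
    N = m * n
    # Probability mass function of the number of observed entries.
    pmf = [comb(N, k) * p**k * (1 - p) ** (N - k) for k in range(N + 1)]
    mean = sum(Fraction(k, N) * w for k, w in enumerate(pmf))
    second_moment = sum(Fraction(k, N) ** 2 * w for k, w in enumerate(pmf))
    return second_moment - mean**2


p = Fraction(1, 4)
for m, n in [(2, 3), (4, 5), (8, 10)]:
    exact = relative_variance(m, n, p)
    closed_form = p * (1 - p) / (m * n)
    print(m, n, exact, exact == closed_form)
```

The printed values shrink like $1/(mn)$, which is the sense in which $G(m \times n, p)$ approaches $U(m \times n, mn \cdot p)$.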

2.2.2 Sampling of the True Matrix 

Table 2 gives a short list of sampling assumptions for the true matrix A. Note that usually, the
specific distribution from which the sampling occurs is not specified, only properties are specified 
which hold for the sampling distribution or the sampled matrix. The reason for this is that the 
relevant statements hold for any sampling distribution fulfilling these assumptions. 



Assumption           Description
incoherent           for the singular value decomposition $A = U S V^\top$, and a global constant $C$, it holds that $\max_i \|e_i^\top U\|_2 \le C/\sqrt{m}$ and $\max_j \|e_j^\top V\|_2 \le C/\sqrt{n}$
non-spiky            there exists a global constant $C$ bounding the spikiness ratio from above, i.e., it holds that $mn \cdot \|A\|_\infty^2 \cdot \|A\|_F^{-2} \le C$
(Zariski-) generic   (algebraic) subsets with Lebesgue/Hausdorff measure zero have (conditional) probability zero

Table 2: Possible sampling assumptions for the true matrix



The most common strategy is to restrict the sampling of matrices to a subset of all matrices, as in incoherence [5] or non-spikiness [29]. We propose a weaker condition, inspired by the Zariski topology⁵ on algebraic sets: we only postulate that the sampling distribution of the true matrix assigns no positive probability to any Hausdorff measure zero subset of $\mathcal{M}(m \times n, r)$. For what concerns the following, one could also postulate this only for irreducible algebraic subsets. This Zariski-like condition is indeed weaker, as the coherent or spiky matrices form a set of positive probability measure in general. In fact, any continuous probability distribution fulfills the postulate, in particular also any distribution supported on non-spiky or coherent matrices.

As the underlying sampling process is unknown in practical applications, as opposed to the chosen mask, we argue that the proposed sampling for true matrices is the weakest possible condition which excludes sampling degeneracies. Thus, we will term it generic sampling, and any random variable fulfilling this condition will be termed generic, or generic matrix. In the next section we will see in fact that under generic sampling assumptions, the behavior of the masking operation and its identifiability depends only on the properties of the mask, and is completely independent of properties of the true matrix.

5 In the Zariski topology, the closed subsets of $\mathbb{C}^n$ are exactly those which can be written as the set of solutions of a finite set of polynomial equations; i.e., $U \subseteq \mathbb{C}^n$ is open if and only if there are polynomials $p_1, \dots, p_m$ in $n$ variables such that $U = \{x \in \mathbb{C}^n \;;\; p_i(x) \neq 0 \text{ for some } 1 \le i \le m\}$. The closed sets are called algebraic sets, and carry the inherited Zariski topology. Zariski closed sets and relatively Zariski closed sets have probability measure zero under any continuous random variable, see the appendix of [23] for more details.

Furthermore, different scenarios restrict to symmetric/Hermitian or antisymmetric/anti-Hermitian matrices, and/or real matrices, as opposed to non-symmetric complex matrices. While these assumptions usually change the results, they do not change them fundamentally. We will discuss these sampling assumptions additionally when appropriate. Other sampling assumptions include definiteness or sign assumptions on the true matrix. As these conditions are semi-algebraic and completely change the flavor of the problem, they will not be discussed in this paper.

For formal purposes, we want to state our definition of (algebraic) genericity: 

Definition 2.2.1. Let $P$ be some property on the matrices $\mathbb{C}^{m \times n}$, such that the set $X$ of $P$ matrices admits a Hausdorff measure (e.g., when $X$ is an algebraic variety), let $Q$ be any property on the matrices $\mathbb{C}^{m \times n}$, and let $Y$ be the set of matrices that are not $Q$. We say that

"A generic $P$ matrix is $Q$"

if the set $X \cap Y$ is a negligible set (i.e., a subset of a null set) under the Hausdorff measure on $X$.

The given definition is a bit broader than the usual concept of genericity applied for moduli 
spaces of algebraic objects, but is, for the current setting, maybe the most intuitive one, without 
making any difference in the consequences, since it implies that any matrix valued random vari- 
able with continuous support on the P matrices will fulfill Q almost surely. Indeed, in the case 
that P and Q define algebraic sets, the definitions (the given one, and very generic/general for 
the moduli space) are equivalent. A more detailed comparison and relation of different concepts 
of genericity, and how they imply each other, can be found in the appendix of [23]. 

To give further intuition for this concept of genericity, and its meaning in the world of low- 
rank matrices, we give some examples for valid statements: 

Example 2.2.2. Recall that we have assumed $r \le m \le n$.

• A generic $(m \times n)$-matrix has only non-zero entries.

• Let $A$ be any fixed $(m \times n)$-matrix of rank $r$. A generic $(m \times n)$-matrix of rank $r$ is not equal to $A$.

• A generic (m x n)-matrix of rank r or less has rank exactly r. 

• A generic (m x n)-matrix of rank r has no vanishing (r x r)-minor. 

• A generic (m x n)-matrix of rank r has no real eigenvalues. 

• A generic positive semi-definite real (m x n)-matrix of rank r is positive definite. 

These statements can be proved using algebraic methods, or probabilistic methods with a proper conditioning. Note that the use of "generic" is not equivalent to the use of "in general", since, for example, a matrix of rank r or less in general need not have rank r. Also, note that a generic (m x n)-matrix is not a single, fixed matrix, as generic is not a qualifier for a single, fixed matrix; on the contrary, it is a qualifier for statements about almost all (m x n)-matrices. Similarly, it can be thought of as an (m x n)-matrix-valued random variable satisfying the generic sampling assumption, about which the sentences make probability-one statements.
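
Some of the statements in Example 2.2.2 can be probed numerically. The following is a minimal sketch under stated assumptions: a rank-$r$ matrix is produced as a product $U V^\top$ of random integer factors (a discrete stand-in for a continuous distribution, so the checks hold only with high probability rather than with probability one), and the helper names `det`, `low_rank`, `minors` and `generic_checks` are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, d = len(a), Fraction(1)
    for c in range(n):
        piv = next((i for i in range(c, n) if a[i][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
    return d


def low_rank(U, V):
    """Return U V^T for an (m x r) list U and an (n x r) list V."""
    return [[sum(u * v for u, v in zip(ui, vj)) for vj in V] for ui in U]


def minors(A, k):
    """Yield all (k x k)-minors of A."""
    for I in combinations(range(len(A)), k):
        for J in combinations(range(len(A[0])), k):
            yield det([[A[i][j] for j in J] for i in I])


def generic_checks(A, r):
    """Evaluate three properties that a generic rank-r matrix is expected to have."""
    return {
        "all entries non-zero": all(x != 0 for row in A for x in row),
        "no vanishing (r x r)-minor": all(d != 0 for d in minors(A, r)),
        "all (r+1 x r+1)-minors vanish": all(d == 0 for d in minors(A, r + 1)),
    }


rng = random.Random(0)
m, n, r = 3, 4, 2
U = [[rng.randint(-10**6, 10**6) for _ in range(r)] for _ in range(m)]
V = [[rng.randint(-10**6, 10**6) for _ in range(r)] for _ in range(n)]
print(generic_checks(low_rank(U, V), r))
```

The third check holds for every matrix of rank at most $r$; the first two are the ones that can fail on a null set, for instance when a row of $U$ is orthogonal to a row of $V$.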



2.3 The Algebraic Combinatorics of Matrix Completion 

This section is devoted to developing the basic algebraic structure of Low-Rank Matrix Completion, and to stating some elementary results which come from the mere fact that Low-Rank Matrix Completion is an algebraic problem in the strict sense. Namely, the generative mapping $\Omega_M$, as it occurs in the final problem description 2.1.7, is an algebraic map. This makes the analysis of the generative model of Matrix Completion, and its identifiability in view of Theorem 2.1.9, amenable to the vast range of tools from commutative algebra and algebraic geometry. A comprehensive introduction to basic algebraic geometric concepts can be found in the introductory chapters of the book by Cox et al. [8]. Note that knowledge of advanced concepts of commutative algebra or algebraic geometry should not be necessary for understanding the current paper, apart from some proof details. Part of the following exposition follows the results obtained in the paper by Király and Tomioka [22].

We motivate the theory which follows with a central question about the identifiability of 
Matrix Completion: 

Question 2.3.1. Given sampling conditions on the true matrices (including the true rank) and a 
fixed mask M: when is the generative model of Matrix Completion identifiable? 



Theorem 2.1.9 states that identifiability of the generative model is equivalent to injectivity of the masking $\Omega_M$, under the given sampling conditions. Thus, the first question which has been asked about identifiability of the model is the following:

Question 2.3.2. Given a fixed mask M, when is Q M injective? 

Injectivity, or one-to-one-ness, is by construction the necessary and sufficient condition for 
properly inverting a map, and thus for identifiability of the generative model. 



For the community, it has long been known that the answer to Question 2.3.2 is very unsatisfactory: it is, basically, in all interesting cases, never. The following proposition, which, together with its proof, is taken from [22], gives the corresponding formal statement which has already been asserted by Candes and Recht [5].

Proposition 2.3.3. Let $r \ge 2$, let $M$ be a mask with $a = \|M\|_1$ known entries. Then the masking $\Omega : \mathcal{M}(m \times n, r) \to \mathbb{C}^a$ is injective if and only if $a = mn$.

Proof. Clearly, if $a = mn$, then $\Omega$ is injective, as it is the identity map. So it suffices to prove: if $r \ge 2$ and $a < mn$, there exists a matrix $A$ such that $\{A\} \neq \Omega^{-1}(\Omega(A))$. Now since $a < mn$, there exists an index $ij$ such that $M(\Omega)_{ij} = 0$. Let $A$ be any matrix whose columns, except the $j$-th, span an $(r-1)$-dimensional vector space. Then any matrix $A'$ which is identical to $A$ but has an arbitrary entry at the index $ij$ is of rank at most $r$, so the set $\Omega^{-1}(\Omega(A))$ contains every such $A'$. □
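
The construction in the proof can be made concrete for $m = n = 3$ and $r = 2$. The following is a minimal sketch; the matrix `A`, the unobserved position and the helper names `rank` and `perturb` are hypothetical choices for illustration. The first two columns of `A` are proportional, so they span a one-dimensional space, and the entry at the unobserved position in the third column can be changed freely without raising the rank above 2.

```python
from fractions import Fraction


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    rk, ncols = 0, len(a[0])
    for c in range(ncols):
        piv = next((i for i in range(rk, len(a)) if a[i][c] != 0), None)
        if piv is None:
            continue
        a[rk], a[piv] = a[piv], a[rk]
        for i in range(rk + 1, len(a)):
            f = a[i][c] / a[rk][c]
            for j in range(c, ncols):
                a[i][j] -= f * a[rk][j]
        rk += 1
    return rk


def perturb(A, i, j, t):
    """Copy of A with the entry at (i, j) replaced by t."""
    B = [list(row) for row in A]
    B[i][j] = t
    return B


# Columns 0 and 1 are proportional; column 2 is independent of them.
A = [[1, 2, 5],
     [2, 4, 7],
     [3, 6, 11]]
i, j = 0, 2  # the single unobserved position (0-based), an assumption of this sketch

for t in (-3, 0, 5, 100):
    # Every perturbed matrix agrees with A on the mask and still has rank at most 2.
    assert rank(perturb(A, i, j, t)) <= 2
print("fiber over the observed entries of A is infinite")
```

Such a matrix `A` is not generic: it has vanishing $(2 \times 2)$-minors, which a generic matrix of rank 2 does not have.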



This answer to Question 2.3.2, which seems to be strongly negative, is maybe the main reason which has led the community to believe that in order to obtain firm results on the generative model of Matrix Completion, the sampling of the true matrices has to be restricted. As discussed in section 2.2.2, most sampling models from the literature put rather strict assumptions on the sampling of the true matrix. As such, the obtained results usually mix the generative sampling model for the true matrix and the mask, which in the end makes it unclear which of the two is at the source of the observed phenomena.

2.3.1 The Algebra of Matrix Completion 

The following argumentation, which is naturally and intuitively obtained from the algebraic 
structure of the problem, shows that the strong and unnatural conditioning on the true matrix 
- which can furthermore, by construction, never be verified in the real world - is not necessary 
to obtain identifiability results for the matrix completion problem. Assuming merely generic 
sampling, which means, for any sampling process of true matrices having no support at null sets, 
one can show that identifiability of Matrix Completion depends only on the mask and the true 
rank, and not on the true matrix, or any further sampling assumptions on the true matrix. 

In order to state the result compactly, we need notation for characterizing the degrees of 
freedom one has in the reconstruction: 

Definition 2.3.4. Let $A$ be an $(m \times n)$-matrix of rank $r$, let $\Omega$ be an $(m \times n)$-masking in rank $r$. Then, the set $\Omega^{-1}(\Omega(A))$ is called the fiber of $\Omega$ at $A$, and, alternatively, the fiber of $\Omega$ over $\Omega(A)$. We will call the integer

$$\dim_A \Omega = \dim \Omega^{-1}(\Omega(A))$$

the fiber dimension (of $\Omega$) at $A$. Similarly, we will call

$$\#_A \Omega = \# \Omega^{-1}(\Omega(A))$$

the fiber cardinality (of $\Omega$) at $A$.

For given $A$, the fiber dimension $\dim_A \Omega$ is exactly the number of degrees of freedom one has in continuously changing $A$ without changing the masked version $\Omega(A)$. That is, the fiber dimension is the degrees of freedom in the range of $\Omega$ which do not appear in its image, at the element $A$; more informally speaking, the fiber dimension is the degrees of freedom killed by $\Omega$ in a neighborhood of $A$. Note that in particular, $\dim_A \Omega = 0$ is equivalent to saying that $\Omega(A)$ has a finite set of possible reconstructions. In this case, $\#_A \Omega$ is an integer, else it is $\infty$.



The following series of observations implies that the invariants introduced in Definition 2.3.4 are indeed generic and characteristic invariants of the masking. They follow from the fact that $\Omega$ is an algebraic map. The most important fact is that for a generic true matrix, its behavior, in terms of fiber dimension and number of possible reconstructions, does not depend on the particular entries or the structure of the true matrix. These results were first stated and proved in [22]; for completeness, we reproduce the proofs. Candes and Recht [5] have implicitly used and assumed some of those, but without proper proofs or references.

Theorem 2.3.5. Let $A$ be a generic $(m \times n)$-matrix of rank $r$, let $\Omega$ be a masking in rank $r$. Then, the following depend only on the true rank $r$ and the mask $M(\Omega)$:

(i) The fiber dimension $\dim_A \Omega$.

(ii) The fiber cardinality $\#_A \Omega$.

(iii) Whether $\#_A \Omega = 1$, i.e., whether $A$ is uniquely completable.

Proof. (i) By definition of genericity, it suffices to show that there is a Zariski open dense set $U$ in $\mathbb{C}^{m \times n}$, such that for all matrices $A \in U$, the set of possible completions $\Omega^{-1}(\Omega(A))$ has the same dimension and cardinality. But this is a direct consequence of the upper semicontinuity theorem (see e.g. 1.8, Corollary 3 in [26]), when applied to the algebraic map $\Omega : \mathcal{M}(m \times n, r) \to \mathbb{C}^a$, considering that $\mathcal{M}(m \times n, r)$ is irreducible.

(ii) In the case of $\dim_A \Omega > 0$, one has $\#_A \Omega = \infty$ and the statement is true. If $\dim_A \Omega = 0$, the statement is an application of the purity of the branch locus, see [46].

(iii) is a special case of (ii). □



Theorem 2.3.5 shows that identifiability properties of $\Omega$ are independent of the true matrix $A$ as long as it is generically sampled; namely $\dim_A \Omega$ and $\#_A \Omega$ are independent of $A$, so we can remove the qualifier $A$, which Theorem 2.3.5 has proved to be unnecessary:

Definition 2.3.6. Let $\Omega$ be an $(m \times n)$-masking in rank $r$, let $A$ be a generic $(m \times n)$-matrix. We will write $\#\Omega$ for the (generic) value of $\#_A \Omega$, and $\dim \Omega$ for the (generic) value of $\dim_A \Omega$.



Also, Theorem 2.3.5 provides motivation for the following definitions which characterize the generative identifiability of $\Omega$ generically:

Definition 2.3.7. Let $\Omega$ be an $(m \times n)$-masking in rank $r$. We call

(i) $\Omega$ generically injective and $M(\Omega)$ identifiable or completable (in rank $r$), if $\#\Omega = 1$.

(ii) $\Omega$ (generically) finite and $M(\Omega)$ finitely identifiable or finitely completable (in rank $r$), if $\#\Omega < \infty$.

(iii) $\Omega$ generically $k$-to-one (in rank $r$), if $\#\Omega = k$.

(iv) $\Omega$ infinite and $M(\Omega)$ unidentifiable (in rank $r$), if $\dim \Omega > 0$.

Thus, generic injectivity of $\Omega$ means that $\Omega$ is 1:1 almost everywhere on its range; that is, after a restriction to the complement of a Lebesgue null set. Similarly, generic finiteness and generic $k$-to-one-ness mean being $k$:1 almost everywhere, with the same $k$. Theorem 2.3.5 ascertains that every masking $\Omega$ will be either generically injective, generically $k$-to-one for some $k \ge 2$, or infinite. Note that the qualifier "in rank $r$" always has to be present for well-definedness.

For intuition, we illustrate the results in Theorem 2.3.5 with the simplest non-trivial (in the 
sense of Proposition 2.3.3) example: 

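Example 2.3.8. Consider the set $\mathcal{M}(2 \times 2, 1)$ of $(2 \times 2)$-matrices of rank 0 or 1. It is the set

$$\mathcal{M}(2 \times 2, 1) = \left\{ \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix} \in \mathbb{C}^{2 \times 2} \;;\; a_{11} a_{22} = a_{12} a_{21} \right\}.$$

Consider the mask

$$M = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}.$$

The masking given by $M$ is

$$\Omega_M : \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix} \mapsto \begin{pmatrix} a_{11} \\ a_{21} \\ a_{12} \end{pmatrix}.$$

We will write $\Omega = \Omega_M$ for convenience. Let $B \in \mathcal{M}(2 \times 2, 1)$ be some fixed matrix with

$$B = \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix}.$$

Then, the fiber at $B$ is

$$\Omega^{-1}(\Omega(B)) = \left\{ \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & a_{22} \end{pmatrix} \in \mathbb{C}^{2 \times 2} \;;\; b_{11} a_{22} = b_{12} b_{21} \right\}.$$

Note that the $b_{ij}$ are now fixed in that set, while $a_{22}$ is the free entry. Now, one of the following two cases has to happen:

Case 1: $b_{11} \neq 0$. Then,

$$\Omega^{-1}(\Omega(B)) = \left\{ \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{12} b_{21} / b_{11} \end{pmatrix} \right\}.$$

In this case, $\dim_B \Omega = 0$, and $\#_B \Omega = 1$.

Case 2: $b_{11} = 0$. Then $b_{12} b_{21} = b_{11} b_{22} = 0$, so

$$\Omega^{-1}(\Omega(B)) = \left\{ \begin{pmatrix} 0 & b_{21} \\ b_{12} & a_{22} \end{pmatrix} \;;\; a_{22} \in \mathbb{C} \right\}.$$

In this case, $\dim_B \Omega = 1$, and $\#_B \Omega = \infty$.

Case 1 is the generic case, as $b_{11}$ is generically non-zero, see the first bullet in Example 2.2.2. Thus, $\dim \Omega = 0$, and $\#\Omega = 1$, so $\Omega$ is generically injective, while not being injective (also compare Proposition 2.3.3).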
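
The two cases of Example 2.3.8 translate directly into a small routine. The following is a minimal sketch; the function name `fiber_2x2_rank1` and its return convention are hypothetical. It takes the three observed entries, in the example's indexing, and reports whether the missing entry $a_{22}$ is uniquely determined or free.

```python
from fractions import Fraction


def fiber_2x2_rank1(b11, b21, b12):
    """Describe the fiber over the observed entries (b11, b21, b12).

    Returns ("unique", a22) if exactly one rank <= 1 completion exists,
    and ("free", None) if every value of a22 gives a completion.
    """
    b11, b21, b12 = Fraction(b11), Fraction(b21), Fraction(b12)
    if b11 != 0:
        # Case 1: the vanishing minor b11 * a22 = b12 * b21 determines a22.
        return ("unique", b12 * b21 / b11)
    if b12 * b21 != 0:
        # With b11 = 0 the minor forces b12 * b21 = 0, so this input is inconsistent.
        raise ValueError("observations are not the image of a matrix of rank <= 1")
    # Case 2: the minor equation reads 0 = 0 and a22 is unconstrained.
    return ("free", None)


print(fiber_2x2_rank1(2, 3, 4))  # generic case
print(fiber_2x2_rank1(0, 0, 5))  # degenerate case
```

The degenerate branch is reached only on the null set $b_{11} = 0$, which is why the masking is generically injective without being injective.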

The algebraic theory also implies some results on possible degeneracies which may occur, i.e., if $A$ is not sampled generically (compare the two cases in Example 2.3.8):

Theorem 2.3.9. Let $\Omega$ be an $(m \times n)$-masking in rank $r$. Let $B$ be any fixed $(m \times n)$-matrix of rank $r$ or less (not necessarily generic). Then

(i) $\dim_B \Omega \ge \dim \Omega$.

(ii) If $\dim_B \Omega = 0$, then $\#_B \Omega \le \#\Omega$.

Proof. The proof is similar to that of Theorem 2.3.5. (i) follows from a more careful application of the upper semicontinuity theorem (see e.g. 1.8, Corollary 3 in [26]), (ii) from purity of the branch locus [46] and the fact that $\#_B \Omega$ is upper bounded by the degree of the field extension $\mathcal{K}(\mathcal{M}(m \times n, r)) / \mathcal{K}(\Omega(\mathcal{M}(m \times n, r)))$, which is the same as $\#\Omega$. □



Note that if $\dim \Omega > 0$, then Theorem 2.3.9 (i) implies $\#_B \Omega = \infty$ for arbitrary $B$. That is, if a generic matrix cannot be reconstructed from a given masking, no matrix can be reconstructed. The similar statement that if a generic matrix can be reconstructed, all matrices can be reconstructed, is false, as the proof of Proposition 2.3.3 shows. Furthermore, for a given masking, there can exist a matrix having unique reconstruction only if the masking is generically finite.

We want to remark another important consequence of Theorem 2.3.9:

Remark 2.3.10. We want to note that the results presented in Theorems 2.3.5 and 2.3.9 are not specific for the case of Low-Rank Matrix Completion. They are special instances of classical results from Algebraic Geometry, which are valid for any well-behaved⁶ algebraic map. Thus, they are in similar form valid for any parametric estimation problem which can be decomposed into an exact generative part, given by an algebraic map, and a noise part.



Theorems 2.3.5 and 2.3.9 allow us to replace Question 2.3.2, which we have seen to be uninformative without further specification, with a question that is equal in spirit and excludes a null set of pathological cases.

Question 2.3.11. Given a fixed mask $M$, when is $M$ identifiable in rank $r$? When is $M$ finitely identifiable in rank $r$?

Theorems 2.3.5 and 2.3.9 show that Question 2.3.11 is well-defined, since the answer depends only on $M$ and the true rank $r$. Also note that since generic injectivity implies generic finiteness, any condition necessary for generic finiteness is also necessary for generic injectivity, and any condition sufficient for generic injectivity is also sufficient for generic finiteness. In general, generic injectivity and generic finiteness do not coincide, as the following example shows.

Example 2.3.12. The mask

$$M = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}$$

is generically two-to-one in rank 2.



It is also important to note that the results of Theorems 2.3.5 and 2.3.9 do not hold (set-theoretically) when working over a field which is not algebraically closed, for example the real numbers $\mathbb{R}$. Thus in particular, over a field that is not algebraically closed, Question 2.3.11 is in general not well-defined. We give the probably simplest example where the behavior over the complex and the real numbers diverges:

6 that is, for any proper morphism of Noetherian schemes X → Y over an algebraically closed base field of characteristic zero, and where X is irreducible

Example 2.3.13. Consider the mask

$$M = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}$$

from Example 2.3.12. In rank 2, the masking $\Omega_M$ is finitely identifiable, and generically two-to-one. Denote by $\Omega_{\mathbb{R}}$ the restriction of $\Omega_M$ to real matrices of rank at most 2. When considering a generic real matrix $A$ of rank 2 as true matrix, then the (set-theoretic) fiber $\Omega_{\mathbb{R}}^{-1}(\Omega_{\mathbb{R}}(A))$ contains, generically, two or no elements. It is not true that it contains generically two elements, nor is it true that it contains generically no element. However, even though $A$ is generic real, the fiber over the complex numbers $\Omega_M^{-1}(\Omega_M(A))$ will generically have two elements.



This behavior is very similar to that of the well-known quadratic equation

$$x^2 + bx + c = 0, \quad \text{with } b, c \in \mathbb{R},$$

which has two real solutions if $b^2 > 4c$ and no real solution if $b^2 < 4c$. Both cases are generic in the sense that they have positive Lebesgue measure, as the sets

$$G_1 = \{(b, c) \;;\; b^2 > 4c\} \quad \text{and} \quad G_2 = \{(b, c) \;;\; b^2 < 4c\}$$

are not null sets in $\mathbb{R}^2 = \{(b, c) \;;\; b, c \in \mathbb{R}\}$. The case $b^2 = 4c$ where a single solution occurs is degenerate (and lies in the ramification locus of the parameter map, compare the proof of Theorem 2.3.5 (ii)). Over the complex numbers, the equation has, for generic choice of $b, c$, always two solutions.
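
The split into two generic real behaviors can be illustrated on the quadratic equation itself. The following is a minimal sketch; the function names `count_real_solutions` and `complex_solutions` are hypothetical.

```python
import cmath


def count_real_solutions(b, c):
    """Number of distinct real solutions of x^2 + b x + c = 0."""
    disc = b * b - 4 * c
    if disc > 0:
        return 2  # (b, c) lies in G_1
    if disc < 0:
        return 0  # (b, c) lies in G_2
    return 1      # degenerate case b^2 = 4c


def complex_solutions(b, c):
    """Both complex solutions of x^2 + b x + c = 0, counted with multiplicity."""
    s = cmath.sqrt(b * b - 4 * c)
    return ((-b + s) / 2, (-b - s) / 2)


for b, c in [(0, -1), (0, 1), (2, 1)]:
    print((b, c), count_real_solutions(b, c), complex_solutions(b, c))
```

Over the reals the count depends on which of the two semi-algebraic sets the parameters fall into, while over the complex numbers two solutions exist away from the null set $b^2 = 4c$.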



Example 2.3.13 shows that over the real numbers, there may be several generic behaviors for the identifiability of $\Omega$, and not only one as in the complex case treated in Theorem 2.3.5. The sets where different types of generic behavior occur are semi-algebraic subsets of $\mathcal{M}(m \times n, r) \cap \mathbb{R}^{m \times n}$. That is, the sets are cut out by polynomial inequalities in the matrix entries, e.g., definiteness (or compare $G_1, G_2$ in Example 2.3.13). As it is in general hard to distinguish and analyze those semi-algebraic subsets properly, we refrain from doing so for the rest of the paper. However, Theorems 2.3.5 and 2.3.9 give bounds for identifiability; that is, for some masking, being generically finite over the complex numbers is, by Theorem 2.3.9, a necessary condition for any real matrix to have a finite set of possible reconstructions.



2.3.2 The Combinatorics of Matrix Completion 

The generative model of Low-Rank Matrix Completion is not only algebraic, but also has deep combinatorial features. This was first noticed by [38], drawing parallels to Rigidity Theory (e.g., [14]). We develop these connections further, studying a generic degree of freedom heuristic.

The combinatorial information in each mask is encoded in a bipartite graph associated to it. 
We recall the notions of bipartite graph and its adjacency matrix: 

Definition 2.3.14. A labeled bipartite graph $G$ is a tuple $G = (V, W, E)$, where $V = [m]$ is the set of red vertices, $W = [n]$ is the set of blue vertices, and $E \subseteq V \times W$. The set $E$ is interpreted as the set of edges running between $V$ and $W$. We will denote the set of edges $E$ of a graph $G$ by $E(G)$, and its cardinality by $e(G) = \#E(G)$.

Two bipartite graphs $G_1 = (V_1, W_1, E_1)$ and $G_2 = (V_2, W_2, E_2)$ are isomorphic if there is a pair of bijections $\sigma_V : V_1 \to V_2$ and $\sigma_W : W_1 \to W_2$ inducing a bijection $(i, j) \mapsto (\sigma_V(i), \sigma_W(j))$ on the edge sets. The equivalence classes under the induced relation are isomorphy types or (unlabeled) bipartite graphs.

Given a bipartite graph $G = (V, W, E)$, its transpose $G^\top$ is the bipartite graph $G^\top = (W, V, E^\top)$, where $(j, i) \in E^\top$ if and only if $(i, j) \in E$.

The adjacency matrix of a labeled bipartite graph $G = (V, W, E)$ is the $(\#V \times \#W)$-matrix $M(G)$, where each row corresponds uniquely to an element of $V$, and each column corresponds uniquely to an element of $W$. The entry in row $i$ and column $j$ is 1 if and only if $(i, j) \in E$, else it is 0.

Note that in all of these definitions, the bipartition of a (labeled or unlabeled) bipartite graph is fixed. Informally, labeled bipartite graphs are isomorphic if one can be obtained from the other by some relabeling of each vertex class separately (i.e., preserving the bipartition). The reason for separating labeled and unlabeled bipartite graphs is that masks correspond to labeled bipartite graphs bijectively via adjacency matrices, but generic completability will turn out to depend only on the unlabeled bipartite graph associated with the mask (Proposition 2.5.26).



Definition 2.3.15. Let $\Omega$ be a masking with mask $M(\Omega)$. We will call the unique labeled bipartite graph $G(\Omega)$ which has adjacency matrix $M(\Omega)$ the adjacency graph of $\Omega$. We will also write $E(\Omega) = E(G(\Omega))$ and $e(\Omega) = e(G(\Omega))$. If we start with the mask $M = M(\Omega)$, we will also denote $G(\Omega)$ by $G(M)$, and similarly replace $\Omega$ by $M$ in $E(M) = E(\Omega)$ and $e(M) = e(\Omega)$.

Also, $M(G^\top) = M(G)^\top$, but $G$ and $G^\top$ are in general not isomorphic.
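
The correspondence between masks and labeled bipartite graphs is straightforward to write down. The following is a minimal sketch; the function names and the $(2 \times 3)$ example mask are hypothetical. Rows are the red vertices, columns are the blue vertices, and labels are 1-based as in $[m]$ and $[n]$.

```python
def mask_to_graph(M):
    """Return the adjacency graph (V, W, E) of a 0/1 mask given as a list of rows."""
    m, n = len(M), len(M[0])
    V = list(range(1, m + 1))  # red vertices = rows
    W = list(range(1, n + 1))  # blue vertices = columns
    E = {(i, j) for i in V for j in W if M[i - 1][j - 1] == 1}
    return V, W, E


def transpose_graph(G):
    """Swap the two vertex classes and reverse every edge."""
    V, W, E = G
    return W, V, {(j, i) for (i, j) in E}


def transpose_mask(M):
    """Matrix transpose of a mask."""
    return [list(col) for col in zip(*M)]


M = [[1, 0, 1],
     [0, 1, 1]]
G = mask_to_graph(M)
print(G)
# The adjacency graph of the transposed mask is the transposed graph.
print(mask_to_graph(transpose_mask(M)) == transpose_graph(G))
```

In this example $G$ has two red and three blue vertices while $G^\top$ has three red and two blue ones, so the two graphs cannot be isomorphic as bipartite graphs with fixed bipartition.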

Before continuing, we illustrate the definition of the adjacency graph of a masking by some 
examples: 



Example 2.3.16. Consider the masks from Example 2.1.6 The masks corresponding to the two 
matrices are 

( 1 1 1 A ( 1 1 ^\ 



Mj = 



10 

V i o o ; 



and M 2 = 



10 

V i o i ; 



We now interpret $M_1$ and $M_2$ as bipartite adjacency graphs. That is, with Definitions 2.3.14 and 2.3.15, both graphs $G(M_k)$, $k = 1, 2$, have three red and three blue vertices: the red vertices are the three rows of $M_k$, and the blue vertices are the three columns of $M_k$. An edge is drawn between red vertex/row $i$ and blue vertex/column $j$ if and only if $M_k$ has the entry 1 at position $(i, j)$. Thus, the graphs $G_k$ shown in figure 2.3.16 are the adjacency graphs $G_k = G(M_k)$.

[Figure: the adjacency graphs $G_1$ and $G_2$, each drawn as a bipartite graph between rows and columns]

Since a mask $M$ is uniquely represented by its associated graph $G(M)$, Question 2.3.11 can be rephrased into a question on algebraic graph combinatorics:

Question 2.3.17. Given a bipartite graph G with adjacency matrix M, what are combinatorial
conditions on G for M to be identifiable? What are the conditions for M to be finitely identifiable?



In the following, we give some combinatorial properties on the graph which are sufficient or necessary for generic finiteness. This includes an exact characterization for the case of rank one, which was already derived by Candes and Recht [5] and Singer and Cucuringu [38] using other techniques.

First we give a sufficient condition that relies on a simple algorithmic procedure, which we 
call r-closure. It corresponds algebraically to subsequently solving minor equations. For ease of 
definition, we first formally define a replacement operation on graphs, which recursively adds 
edges to existing sub-patterns. 

Definition 2.3.18. Let $H', H$ be (bipartite) graphs with the same vertex set. Let $M', M$ be the adjacency matrices of $H', H$. If $M - M'$ is non-negative (i.e., it is a mask or adjacency matrix), then the induced injection of graphs $\phi : H' \to H$ (fixing vertices) is called a closing.⁷

The concept of closing will become important in an algorithmic context, where we search for subgraphs $H'$ and add all edges which are missing in $H$. The map $\phi$ is needed to describe exactly where the edges are missing, since specification of $H'$ and $H$ does in general not suffice:

Example 2.3.19. Consider the masks /adjacency matrices 



This induces an injection of graphs H' — * H which is different from the injection one gets when 
replacing M by 



since in one case two edges in the same row are added, in the other two edges in two different rows 
and two different columns are added. 

Definition 2.3.20. Let $G$ be a (bipartite) graph with vertex set $V$, let $\phi : H' \to H$ be a closing. We define⁸ a graph $\operatorname{cl}[\phi](G)$, having vertex set $V$ and containing $G$, together with a closing $\operatorname{cl}[\phi] : G \to \operatorname{cl}[\phi](G)$, by the following properties:

(i) Every edge $e$ in $\operatorname{cl}[\phi](G)$ is an image of an edge $e'$ in $G$ under $\operatorname{cl}[\phi]$, or there exists a subgraph $F \subseteq G$ isomorphic to $H'$ such that $e$ connects vertices in the image $\operatorname{cl}[\phi](F)$.

(ii) For each subgraph $F \subseteq G$ isomorphic to $H'$, there is a closure $\rho$ such that the restriction of $\operatorname{cl}[\phi]$ to $F$ factors as $\rho \circ \phi$.

7 An alternative (shorter but more technical) definition of a closing is: a graph monomorphism $\phi : H' \to H$ with $V(H') = V(H)$ is called closing.

8 An alternative definition of $\operatorname{cl}[\phi](G)$ is as follows: Take a closing $\phi$ which corresponds to two $(m' \times n')$-masks $M, M'$ with $M - M'$ positive. Let $G$ be a bipartite graph with $(m \times n)$ adjacency matrix $A'$. Let $A$ be the unique $(m \times n)$-mask with the least number of non-zero entries such that for all row selection matrices $P_m \in \mathbb{C}^{m' \times m}$ and $P_n \in \mathbb{C}^{n' \times n}$ (i.e., $P_m, P_n$ are the first $m'$ resp. $n'$ rows of a permutation matrix), the matrix $P_m A P_n^\top - M$ is positive if $P_m A' P_n^\top - M'$ is positive. Then, $\operatorname{cl}[\phi](G)$ is defined as the graph $G(A)$.

Let $\phi_1, \dots, \phi_k$ be closings. Then we write $\operatorname{cl}^0[\phi_1, \dots, \phi_k](G) = G$, and, by induction,

$$\operatorname{cl}^n[\phi_1, \dots, \phi_k](G) = \operatorname{cl}[\phi_1] \cdots \operatorname{cl}[\phi_k]\left(\operatorname{cl}^{n-1}[\phi_1, \dots, \phi_k](G)\right)$$

for all integers $n \ge 1$.

If it is clear from the context, we let the operations $\operatorname{cl}^n$ also act on the adjacency matrices instead of the graphs.

Intuitively, the operation $\operatorname{cl}[\phi]$ takes all instances of $H'$ in $G$, and adds all missing edges in $H$, in the way the map $\phi$ prescribes it. $\operatorname{cl}[\phi_1, \dots, \phi_k]$ does the same for several closure patterns $\phi_1, \dots, \phi_k$. Since the number of possible edges is finite, the closing operation $\operatorname{cl}^n$ does not add any new edges for big enough $n$, which makes the following definition well-posed.

Example 2.3.21. Let $H = K_{2,2}$ be the complete bipartite graph with two red and two blue vertices (often called biclique), and let $H' = K_{2,2}^-$ be $K_{2,2}$ minus one edge. The graph $H'$, which we call the almost biclique, has the same isomorphism type for any choice of the missing edge, so the notation $H' = K_{2,2}^-$ has no ambiguity, and the induced map $\phi : H' \to H$ that adds the missing edge from $H'$ to $H$ is canonical.

Let G be the bipartite graph with adjacency matrix 



Then, 



cl[0](M) 



f 1 
1 

V 1 



M = 



1 A 
1 

; 



( 1 M 
110 

Vi 0) 



and cl 2 [0](M): 



f 1 
1 

V 1 



1 A 
1 

1 ; 



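Definition 2.3.22. Let $\phi_1, \dots, \phi_k$ be closings. Let $N \in \mathbb{N}$ be any integer such that

$$\operatorname{cl}^N[\phi_1, \dots, \phi_k](G) = \operatorname{cl}^{N-1}[\phi_1, \dots, \phi_k](G).$$

Then, we write $\operatorname{cl}^\infty[\phi_1, \dots, \phi_k](G) = \operatorname{cl}^N[\phi_1, \dots, \phi_k](G)$. The graph $\operatorname{cl}^\infty[\phi_1, \dots, \phi_k](G)$ is called the $[\phi_1, \dots, \phi_k]$-closure of $G$. If $k = 1$, we also write $\phi$-closure instead of $[\phi]$-closure.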

For Matrix Completion, the most obvious closure operation is of the same type as in Example 2.3.21, i.e., adding missing edges to almost complete bicliques:

Definition 2.3.23. Denote by $K_{r+1,r+1}^-$ the complete bipartite graph $K_{r+1,r+1}$ minus one edge. Let $\phi : K_{r+1,r+1}^- \to K_{r+1,r+1}$ be the canonical inclusion map.

Let $G$ be a bipartite graph with $m$ red and $n$ blue vertices. Instead of $\phi$-closure of $G$, we will also say $r$-closure of $G$. If $\operatorname{cl}^\infty[\phi](G) = K_{m,n}$, then we call $G$ an $r$-closable (bipartite) graph.

Intuitively, the $r$-closure of a graph $G$ is obtained by repeatedly adding single missing edges which complete a subgraph of $G$ to the complete subgraph $K_{r+1,r+1}$. It generalizes transitive closure, as the following lemma shows:

Lemma 2.3.24. The 1-closure of a bipartite graph $G$ is the transitive bipartite closure of $G$. A bipartite graph is 1-closable if and only if it is connected.

Proof. It suffices to prove equivalence of 1-closure and transitive bipartite closure, as the second statement follows from that.

First, we show that the 1-closure is contained in the transitive closure. This is equivalent to showing that every edge contained in the 1-closure is contained in the transitive closure. But that follows from the fact that any edge added in the closure process connects vertices in the same connected component, since $K_{2,2}^-$ is connected, and the new edge is added between two vertices of an already existing $K_{2,2}^-$.

Now we show that the transitive closure is contained in the 1-closure; i.e., any edge contained in the transitive closure is contained in the 1-closure. As closure is defined by adding edges at positions given by subgraphs, it suffices to prove that for trees, by choosing a spanning forest of $G$. A simple inductive argument can then be used to see that every edge in the connected component is added via the closure process. □
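
The $r$-closure can be computed by brute force on small masks. The following is a minimal sketch, not the authors' algorithm; the function names `r_closure` and `is_r_closable` are hypothetical. It repeatedly scans all $((r+1) \times (r+1))$-submatrices of the mask and fills the single missing entry whenever exactly one is missing. Because adding edges never destroys an almost complete biclique, filling entries one at a time reaches the same fixed point as the simultaneous operation $\operatorname{cl}[\phi]$.

```python
from itertools import combinations


def r_closure(mask, r):
    """Fixed point of filling the one missing entry of (r+1) x (r+1) blocks."""
    M = [list(row) for row in mask]
    m, n = len(M), len(M[0])
    changed = True
    while changed:
        changed = False
        for I in combinations(range(m), r + 1):
            for J in combinations(range(n), r + 1):
                missing = [(i, j) for i in I for j in J if M[i][j] == 0]
                if len(missing) == 1:
                    i, j = missing[0]
                    M[i][j] = 1  # add the edge completing K^-_{r+1,r+1}
                    changed = True
    return M


def is_r_closable(mask, r):
    """True if the r-closure is the complete bipartite graph K_{m,n}."""
    return all(x == 1 for row in r_closure(mask, r) for x in row)


# A path on three rows and three columns: connected, hence 1-closable.
path = [[1, 0, 0],
        [1, 1, 0],
        [0, 1, 1]]
# Two disjoint edges: not connected, hence not 1-closable.
disconnected = [[1, 0],
                [0, 1]]
# The (4 x 4) mask with zero diagonal and ones elsewhere, with r = 2.
zero_diag = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

print(is_r_closable(path, 1), is_r_closable(disconnected, 1), is_r_closable(zero_diag, 2))
```

The scan visits every pair of index subsets, so the cost grows quickly with $m$, $n$ and $r$; the sketch is meant for checking small examples only.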

The graph-theoretical concept of $r$-closure now allows us to formulate a sufficient graph-theoretical condition for finite completability, which was already obtained in [22]. We reproduce the proposition and the proof here.

Proposition 2.3.25. A masking $\Omega$ is generically injective in rank $r$ if $G(\Omega)$ is $r$-closable.

Proof. If $G(\Omega)$ contains a subgraph isomorphic to $K^-_{r+1,r+1}$, this means that for a generic matrix $A$, some $((r+1) \times (r+1))$-submatrix $A'$ is known in $\Omega(A)$, except for one matrix element, corresponding to the missing edge in $K^-_{r+1,r+1}$. Since $A$ has rank $r$, the determinant of $A'$ vanishes. As all entries but one are known, the vanishing minor condition gives a linear equation in the missing entry. The linear equation is non-trivial, i.e., not the trivial equation $0 = 0$, since due to the genericity of $A$, the linear and constant coefficients are both non-zero. Thus the linear equation allows us to uniquely reconstruct $A'$ and thus uniquely determine an unknown entry of $A$. Now $r$-closability translates to the fact that this process can be repeated until the whole of $A$ is uniquely reconstructed. As we have assumed that $A$ is generic, this implies generic injectivity for $\Omega$. □
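
The closure process used in this proof can be sketched in code. The following is a minimal brute-force sketch, not an algorithm taken from this paper: the names `r_closure` and `is_r_closable` and the encoding of a mask as a list of 0/1 rows are assumptions made for illustration. A missing entry $(i, j)$ is filled in whenever the rows observed in column $j$ and the columns observed in row $i$ contain a fully observed $r \times r$ block, which together with row $i$ and column $j$ forms a copy of $K^-_{r+1,r+1}$.

```python
from itertools import combinations

def r_closure(mask, r):
    """Return the r-closure of a 0/1 mask (list of rows) as a list of bool rows."""
    M = [[bool(x) for x in row] for row in mask]
    m, n = len(M), len(M[0])
    changed = True
    while changed:
        changed = False
        for i in range(m):
            for j in range(n):
                if M[i][j]:
                    continue
                # Neighbours of column j (rows) and of row i (columns).
                rows = [a for a in range(m) if M[a][j]]
                cols = [b for b in range(n) if M[i][b]]
                # An r x r biclique between them, together with row i and
                # column j, is K_{r+1,r+1} minus the edge (i, j).
                if any(all(M[a][b] for a in I for b in J)
                       for I in combinations(rows, r)
                       for J in combinations(cols, r)):
                    M[i][j] = True
                    changed = True
    return M

def is_r_closable(mask, r):
    """True if the r-closure of the mask is the complete mask."""
    return all(all(row) for row in r_closure(mask, r))
```

The search over all pairs of $r$-subsets is exponential, so this is only meant for small masks; for $r = 1$ it reproduces the connectivity criterion of Lemma 2.3.24.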

We want to mention that $r$-closability of the associated graph is necessary neither for generic injectivity nor for generic finiteness. The mask from Example 2.3.13 proves the latter, as it is not 2-closable. We will now prove by example that $r$-closability is not necessary for generic injectivity:



Example 2.3.26. The mask

$M$ = [a $6 \times 7$ mask; its entries are not recoverable from the garbled source layout]

is uniquely completable in rank 3, but not 3-closable.

However, note: For $m, n < 3$, generic injectivity and $r$-closability coincide. For $m, n < 5$, generic finiteness and $r$-closability coincide.

Proof. We can see that $M$ is uniquely completable in rank 3 by first observing that the two missing entries in the middle appear in two $4 \times 4$ vanishing minor equations, one in the top left corner (rows 1 to 4 and columns 1 to 4) and another in the bottom right corner (rows 3 to 6 and columns 4 to 7); note that both of these equations are linear with respect to the two missing entries, because they appear in the same column. Therefore, the two missing entries are the solution of a system of two linear equations, which is generically unique. After the two missing entries are determined, the mask becomes 3-closable and therefore uniquely completable. On the other hand, $M$ is not 3-closable because for each missing entry $(i, j)$ the bipartite subgraph $(N(j), N(i), E')$, where $N(j)$ is the set of neighbors of $j$ and $E' = E(M) \cap (N(j) \times N(i))$, does not contain a $3 \times 3$ biclique. Due to symmetry, we need to check only 5 bipartite subgraphs to see this.

The statements on finiteness and injectivity follow from an exhaustive search using the algorithms in Section 3. □



In Section 2.5.3 we will see that a necessary condition for generic finiteness can be formulated in terms of some closure, which in general is not equivalent to some $r$-closure.

Now, we prove some necessary conditions for generic finiteness of a masking. First, recall the definition of $r$-connectedness:

Definition 2.3.27. We say a bipartite graph $G = (V, W, E)$ is $r$-edge-connected, or $r$-connected, if for any non-trivial vertex partition$^{9}$ of $G$ into two graphs $G_1, G_2$, the number of edges running between $G_1$ and $G_2$ is lower bounded by $r$, i.e.,

$$e(G) - e(G_1) - e(G_2) \ge r.$$

That means $G$ stays connected after removing an arbitrary set of $r - 1$ edges.

We also define an abbreviation for the number of degrees of freedom of $(m \times n)$-matrices of rank $r$, compare Remark 2.3.29:

Definition 2.3.28. For $m, n \in \mathbb{N}$, we will write

$$d_r(m, n) = mn - \max(0, m - r) \cdot \max(0, n - r).$$

If $G$ is a graph with $m$ red and $n$ blue vertices, we will also write $d_r(G) = d_r(m, n)$.

Note that $d_r(m, n) = mn$ if $m < r$ or $n < r$, else $d_r(m, n) = r \cdot (m + n - r)$.

Remark 2.3.29. The number $d_r(m, n)$ is the dimension of the determinantal variety $\mathcal{M}(m \times n, r)$, which is classically known. Intuitively, this is the number of (algebraic) degrees of freedom in the set of $(m \times n)$-matrices of rank $r$ (or less). If $n < r$ or $m < r$ holds, then $\mathcal{M}(m \times n, r) = \mathbb{C}^{m \times n}$, and it directly follows that $d_r(m, n) = \dim \mathcal{M}(m \times n, r)$ in that case. In all other cases, $d_r(m, n) = r \cdot (m + n - r)$. There are several ways to prove that this number is the same as $\dim \mathcal{M}(m \times n, r)$; we want to give two heuristic arguments (not proofs) why this should be the case.

First, consider a rank $r$ matrix $A$ of size $m \times n$. We can choose $A$ by first choosing the row span, which is an $r$-dimensional sub-vector space $V$ of $n$-space. Classically, this choice is known to have $r(n - r)$ degrees of freedom, and is parameterized by the Grassmannian $\operatorname{Gr}(r, n)$ (the latter is also an algebraic variety, and its dimension is $r(n - r)$). Then, one can choose each row of $A$ from $V$; since $V$ is $r$-dimensional, this is only $r$ degrees of freedom for each of the $m$ rows, so in total

$$r(n - r) + mr = r \cdot (m + n - r)$$

degrees of freedom.

Alternatively, one can write $A = U V^\top$ with $U$ an $(m \times r)$-matrix and $V$ an $(n \times r)$-matrix. There are a total of $r \cdot (m + n)$ entries in both $U$ and $V$, but one can replace a particular choice of $U, V$ by $U \cdot B$ and $V \cdot (B^{-1})^\top$, where $B$ is any full rank $(r \times r)$-matrix. There are $r^2$ degrees of freedom to choose $B$. Since the choice of $B$ accounts for degrees of freedom which do not appear in the choice of $A$, one has to subtract them from those in the choice of $U, V$, giving a total of

$$r \cdot (m + n) - r^2 = r \cdot (m + n - r)$$

degrees of freedom.

Note that neither argument constitutes a proof, as it has to be shown that the degrees of freedom added together are not redundant in the first argument, and that there are no other degrees of freedom which do not appear in $A$ that could be subtracted in the second. Both arguments give $r \cdot (m + n - r)$ as an upper bound on the degrees of freedom though.

9 That is, one has $V = V_1 \cup V_2$, $V_1 \cap V_2 = \emptyset$ and $W = W_1 \cup W_2$, $W_1 \cap W_2 = \emptyset$, and $E(G_1) \cup E(G_2) \subseteq E(G)$; non-trivial means that each of $G_1, G_2$ contains at least one vertex.

Now we state some necessary conditions on generic finiteness: 

Proposition 2.3.30. If a masking $\Omega$ is generically finite in rank $r$, then:

(i) $e(\Omega) \ge d_r(G(\Omega))$.

(ii) Each vertex of $G(\Omega)$ has degree at least $r$.

(iii) $G(\Omega)$ is $r$-connected.

Proof. (i) By the dimension formula, it holds that

$$\dim \Omega = \dim \mathcal{M}(m \times n, r) - \dim \Omega(\mathcal{M}(m \times n, r)).$$

By definition, $\Omega$ is generically finite if and only if $\dim \Omega = 0$, thus $\Omega$ is generically finite if and only if

$$\dim \mathcal{M}(m \times n, r) = \dim \Omega(\mathcal{M}(m \times n, r)).$$

Again, by definition, one has $\dim \Omega(\mathcal{M}(m \times n, r)) \le e(\Omega)$. Thus, if $\Omega$ is generically finite, then

$$d_r(G(\Omega)) = \dim \mathcal{M}(m \times n, r) = \dim \Omega(\mathcal{M}(m \times n, r)) \le e(\Omega).$$

(ii) This follows from (iii), applied to the partition of the graph into one vertex and the rest.

(iii) We will show that the statement follows from the more general Proposition 2.3.33. Since Proposition 2.3.33 is proved later, note that there are no loops in the proof structure. So assume that $\Omega$ is generically finite. Proposition 2.3.33 (iii) then shows that for any vertex partition of $G = G(\Omega)$ into two graphs $G_1, G_2$, it holds that

$$e(G) - e(G_1) - e(G_2) \ge d_r(G) - d_r(G_1) - d_r(G_2).$$

An elementary calculation shows that the right hand side is always $r$ or greater if the $G_i$ are non-trivial, thus

$$e(G) - e(G_1) - e(G_2) \ge r$$

for any non-trivial vertex partition of $G(\Omega)$ into $G_1, G_2$, which is the definition of $r$-connectedness.

□
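
The three necessary conditions of Proposition 2.3.30 can be checked directly on small masks. The sketch below is a brute-force illustration under assumed names (`d_r`, `necessary_conditions`) and an assumed encoding of the mask as a list of 0/1 rows; it is not one of the algorithms developed later in the paper. Condition (iii) is tested by enumerating all vertex bipartitions, which is exponential in $m + n$.

```python
from itertools import product

def d_r(m, n, r):
    """Dimension of the variety of (m x n)-matrices of rank at most r."""
    return m * n - max(0, m - r) * max(0, n - r)

def necessary_conditions(mask, r):
    """Return the truth values of conditions (i), (ii), (iii) for a 0/1 mask."""
    m, n = len(mask), len(mask[0])
    edges = [(i, j) for i in range(m) for j in range(n) if mask[i][j]]
    # (i) At least d_r(m, n) observed entries.
    cond_i = len(edges) >= d_r(m, n, r)
    # (ii) Every row and every column has at least r observed entries.
    row_deg = [sum(1 for x in row if x) for row in mask]
    col_deg = [sum(1 for i in range(m) if mask[i][j]) for j in range(n)]
    cond_ii = all(d >= r for d in row_deg + col_deg)
    # (iii) Every non-trivial vertex bipartition is crossed by >= r edges.
    cond_iii = True
    for side in product((0, 1), repeat=m + n):
        if len(set(side)) < 2:          # one part is empty: trivial partition
            continue
        crossing = sum(1 for i, j in edges if side[i] != side[m + j])
        if crossing < r:
            cond_iii = False
            break
    return cond_i, cond_ii, cond_iii
```

Since the conditions are only necessary, a result of `(True, True, True)` does not establish finite completability.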
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The following example proves that the conditions given in Proposition 2.3.30 are not suffi- 
cient: 



Example 2.3.31. Consider the mask

$M$ = [the entries of the mask are not recoverable from the garbled source layout]

$M$ is not finitely completable in rank 2, but $G(M)$ is 2-connected. In particular, each vertex in $G(M)$ has degree at least 2. This is equivalent to the fact that each row and each column of $M$ has at least 2 non-zero entries. Also, $e(\Omega) = 14 \ge 14 = r \cdot (m + n - r)$. Thus, none of the conditions in Proposition 2.3.30 is sufficient for finite completability, nor is their conjunction.



Example 2.3.31 also shows that $r$-connectedness is too weak to describe finite completability. Namely, if the graphs in a vertex partition, as in Definition 2.3.27 of $r$-connectedness, are similarly large, the number of edges running between them has to be bigger than $r$; also, the more balanced the partition is, the more edges have to run between the partition components.

We now introduce a concept of rank-related sparsity which, in its dual notion, equivalently reflects that fact. Singer and Cucuringu [38] have already conjectured that some sparsity concept might play a role in describing the completable masks.

Definition 2.3.32. A bipartite graph $G$ is called rank-$r$-sparse if for all subgraphs $G' \subseteq G$ it holds that $e(G') \le d_r(G')$.

If, additionally, $G$ has exactly $d_r(G)$ edges, the graph $G$ is called rank-$r$-tight.

We say $G$ is spanned by a rank-$r$-tight graph if $G$ contains a rank-$r$-tight graph with $m$ red and $n$ blue vertices. For short, we also say that $G$ is spanned in rank $r$.
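
For small graphs, Definition 2.3.32 can be checked by exhaustive enumeration. The following sketch uses assumed names (`is_rank_r_sparse`, `is_rank_r_tight`) and an assumed 0/1 list-of-rows encoding of the adjacency matrix. It relies on the observation that, for a fixed choice of red and blue vertices, the induced subgraph has the most edges, so only induced subgraphs need to be tested.

```python
from itertools import combinations

def d_r(m, n, r):
    """d_r(m, n) = mn - max(0, m - r) * max(0, n - r)."""
    return m * n - max(0, m - r) * max(0, n - r)

def is_rank_r_sparse(mask, r):
    """Check e(G') <= d_r(G') for every induced subgraph G' of the mask graph."""
    m, n = len(mask), len(mask[0])
    for a in range(1, m + 1):
        for rows in combinations(range(m), a):
            for b in range(1, n + 1):
                for cols in combinations(range(n), b):
                    e = sum(1 for i in rows for j in cols if mask[i][j])
                    if e > d_r(a, b, r):
                        return False
    return True

def is_rank_r_tight(mask, r):
    """Rank-r-sparse with exactly d_r(m, n) edges."""
    m, n = len(mask), len(mask[0])
    e = sum(1 for row in mask for x in row if x)
    return e == d_r(m, n, r) and is_rank_r_sparse(mask, r)
```

For $r = 1$ one has $d_1(a, b) = a + b - 1$, so rank-1-sparse graphs are exactly forests and rank-1-tight graphs are spanning trees, which the check reproduces on small examples.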

Rank-$r$-sparsity is closely related to combinatorial properties defined using partitions of the vertices and edges (cf. the Nash-Williams-Tutte Arboricity Theorem [27, 44]):

Proposition 2.3.33. Let $G$ be a bipartite graph with at least $d_r(G)$ edges. Consider the following statements:

(i) $G$ is spanned in rank $r$.

(ii) For all partitions of the edges$^{10}$ of $G$ into subgraphs $G_1, \dots, G_N$,

$$d_r(G) \le \sum_{i=1}^{N} d_r(G_i).$$

(iii) For all partitions of the vertices$^{11}$ of $G$ into induced subgraphs $G_1, \dots, G_N$,

$$e(G) - d_r(G) \ge \sum_{i=1}^{N} \left( e(G_i) - d_r(G_i) \right).$$

Then, the implications (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) hold, and each of the three conditions (i), (ii), (iii) is necessary for a masking $\Omega$ with $G = G(\Omega)$ to be generically finite.

10 I.e., if $G = (V, W, E)$ and $G_i = (V_i, W_i, E_i)$, then $E = E_1 \cup E_2 \cup \dots \cup E_N$, and $E_i \cap E_j = \emptyset$ for all $i \ne j$.

11 I.e., if $G = (V, W, E)$ and $G_i = (V_i, W_i, E_i)$, then $V = V_1 \cup V_2 \cup \dots \cup V_N$, and $V_i \cap V_j = \emptyset$ for all $i \ne j$; same for $W$.



Proof. That (i) is necessary for $\Omega$ to be generically finite follows from Theorem 2.5.31; since Theorem 2.5.31 will be proved later, it is important to note that there are no loops in the proof argument. Necessity of (ii) and (iii) follows directly once the implications are proved.

(i) $\Rightarrow$ (ii): We show that a graph $G$ which violates the inequality in (ii) cannot be spanned in rank $r$. Let $G_1, \dots, G_N$ be some edge partition such that

$$d_r(G) > \sum_{i=1}^{N} d_r(G_i).$$

Let $H$ be any rank-$r$-sparse graph contained in $G$. Denote by $h$ the number of edges of $H$, and by $h_i$ the number of edges of $H$ lying in $G_i$. By definition, one has $h_i \le d_r(G_i)$, and $h = h_1 + \dots + h_N$. Inserting into the inequality above gives

$$d_r(G) > \sum_{i=1}^{N} d_r(G_i) \ge \sum_{i=1}^{N} h_i = h,$$

which shows that $H$ has fewer than $d_r(G)$ edges, so $H$ is not a rank-$r$-tight graph spanning $G$.

(ii) $\Rightarrow$ (iii): Let $G_1, \dots, G_N$ be a vertex partition of $G$, as in (iii). Let $H_1, \dots, H_M$ be single-edge graphs for all the edges not contained in any of the $G_i$. Thus, $G_1, \dots, G_N, H_1, \dots, H_M$ is an edge partition of $G$. Denote by $m_i, n_i$ the numbers of red and blue vertices of $G_i$. Then, by (ii), and since $d_r(H_j) = 1$ for each single-edge graph (for $r \ge 1$), one has

$$d_r(G) \le M + \sum_{i=1}^{N} d_r(G_i).$$

The condition in (iii) follows from

$$M = e(G) - \sum_{i=1}^{N} e(G_i)$$

and elementary arithmetic. □

Remark 2.3.34. The conditions (i), (ii), and (iii) can be derived from various heuristics for finite completability of a mask:

(i) To be minimally completable, a mask needs $r \cdot (m + n - r)$ total edges by a dimension count, and no subgraph should be "overloaded".

(ii) For any partition, the sum is an upper bound for the maximum size of a rank-$r$-sparse subgraph.

(iii) In a completable mask, if we replace the pieces of any partition of the rows and columns with $r \times r$ sub-matrices, the associated contracted graph is also completable.

Condition (iii) was also proved directly in [21]; the proof path presented here can be specialized to that one.

After we develop the machinery of completion matroids, we will be able to show that (i) is indeed necessary for generic finite completability of a masking (Theorem 2.5.31), implying the same thing for (ii) and (iii).

2.4 Partial Completion

Matrix Completion, as defined so far, asks whether a low-rank matrix can be completely reconstructed from a set of its entries. However, in practical scenarios, e.g., recommendation or prediction, it is more common that one is only interested in reconstructing some of the missing entries, not all. Most approaches overlook this fact as they rely on reconstructing the complete matrix first.

We will call this task Partial Low-Rank Matrix Completion, or just Partial Completion. Most important observations made in Section 2.3 still hold for Partial Completion, for example the existence of a zero-measure set of exceptions, or that reconstructability only depends on the pattern of observed entries. To prove this, we will use the tools from Algebraic Combinatorics from Section 2.3. One can also decompose the generative sampling model into an algebraic part and a noise part; we leave that to the reader, as it is very similar to what is presented in Section 2.2.

First, we want to formally state the problem of Partial Completion. As the problem generalizes Matrix Completion, we already start with a refined formulation that includes generic sampling:

Question 2.4.1. Given an $(m \times n)$-mask $M$ and a generic $(m \times n)$-matrix $A$ of rank $r$, which entries of $A$ can be reconstructed from the masked matrix $\Omega_M(A)$?

In order to get a formal grasp on Question 2.4.1, we define the analogues of masking for Partial Completion:

Definition 2.4.2. Let $N, M$ be $(m \times n)$-masks such that $N - M$ is a mask. Let $\Omega_M, \Omega_N$ be the corresponding maskings in rank $r$. The unique map $\Omega_{N/M} : \Omega_N(\mathcal{M}(m \times n, r)) \to \mathbb{C}^{\alpha}$ such that

$$\Omega_M = \Omega_{N/M} \circ \Omega_N$$

is called a partial masking in rank $r$. The pair $(N, M)$ is called the partial mask and denoted by $M(\Omega_{N/M})$. If $N - M$ contains exactly one non-zero entry, in the $i$-th row and $j$-th column, we will also write $((ij), M)$ for the partial mask. The inclusion map of graphs $(G \hookrightarrow H)$ such that $H$ has adjacency matrix $N$ and $G$ has adjacency matrix $M$ is called the graph map of $\Omega$ and denoted by $G(\Omega)$.

We also will look at one special fiber and define what fiber dimension and cardinality are in this case:

Definition 2.4.3. Let $A$ be an $(m \times n)$-matrix of rank $r$, let $(N, M)$ be a partial $(m \times n)$-mask. Let $\Omega_N, \Omega_M$ be the corresponding maskings in rank $r$, let $\Omega_{N/M}$ be the partial masking in rank $r$. We will call the integer

$$\dim_A \Omega_{N/M} = \dim \Omega_{N/M}^{-1}(\Omega_M(A))$$

the fiber dimension (of $\Omega_{N/M}$) at $A$. Similarly, we will call

$$\#_A \Omega_{N/M} = \# \Omega_{N/M}^{-1}(\Omega_M(A))$$

the fiber cardinality (of $\Omega_{N/M}$) at $A$.



The analogue of Theorem 2.3.5 for the Partial Completion setting is:

Theorem 2.4.4. Let $A$ be a generic $(m \times n)$-matrix of rank $r$, let $\Omega$ be a partial masking in rank $r$. Then, the following depend only on the true rank $r$ and the mask $M(\Omega)$:

(i) The fiber dimension $\dim_A \Omega$.

(ii) The fiber cardinality $\#_A \Omega$.

(iii) Whether $\#_A \Omega = 1$, i.e., whether the entries of $A$ masked by $\Omega$ are uniquely completable.

Proof. The proof is completely analogous to that of Theorem 2.3.5. The only additional thing to note is that the domain of $\Omega_{N/M}$, which is $\Omega_N(\mathcal{M}(m \times n, r))$, is irreducible. But that is true since $\Omega_N(\mathcal{M}(m \times n, r))$ is a projection of an irreducible variety. □

The following are the generalized definitions concerning identifiability and generic behavior for the Partial Completion setting:

Definition 2.4.5. Let $\Omega$ be a partial $(m \times n)$-masking in rank $r$, let $A$ be a generic $(m \times n)$-matrix. We will write $\#\Omega$ for the (generic) value of $\#_A \Omega$, and $\dim \Omega$ for the (generic) value of $\dim_A \Omega$.

Again, the generic values of dimension and cardinality bound all possible values:

Theorem 2.4.6. Let $\Omega$ be a partial $(m \times n)$-masking in rank $r$. Let $B$ be any fixed $(m \times n)$-matrix of rank $r$ or less (not necessarily generic). Then

(i) $\dim_B \Omega \ge \dim \Omega$.

(ii) If $\dim_B \Omega = 0$, then $\#_B \Omega \le \#\Omega$.

Proof. The proof is completely analogous to that of Theorem 2.3.9. □



Now we introduce the analogues characterizing the generic behavior for Partial Completion:

Definition 2.4.7. Let $\Omega$ be a partial $(m \times n)$-masking in rank $r$. We call

(i) $\Omega$ generically injective and $M(\Omega)$ identifiable or completable (in rank $r$), if $\#\Omega = 1$.

(ii) $\Omega$ (generically) finite and $M(\Omega)$ finitely identifiable or finitely completable (in rank $r$), if $\#\Omega < \infty$.

(iii) $\Omega$ generically $k$-to-one (in rank $r$), if $\#\Omega = k$.

(iv) $\Omega$ infinite and $M(\Omega)$ unidentifiable (in rank $r$), if $\dim \Omega > 0$.

Remark 2.4.8. Let $\Omega$ be a partial masking with partial mask $(N, M)$. Whether $\Omega$ is generically injective, finite, etc. can be checked separately for single entries of $N - M$. That is, write

$$N - M = \sum_{i=1}^{n} (N_i - M) \quad \text{with masks } N_i$$

such that $\|N_i - M\|_1 = 1$, i.e., $N_i - M$ contains only one non-zero entry. Then it can be shown:

(i) $M(\Omega)$ is identifiable if and only if $(N_i, M)$ is identifiable for all $1 \le i \le n$.

(ii) $M(\Omega)$ is finitely identifiable if and only if $(N_i, M)$ is finitely identifiable for all $1 \le i \le n$.

(iii) $M(\Omega)$ is unidentifiable if and only if there exists an $i$ such that $(N_i, M)$ is unidentifiable.

(iv) If $\Omega$ is generically $k$-to-one, and the prime factorization of $k$ is $k = p_1 \cdot p_2 \cdots p_\ell$, then for all $j$ there exists an $i$ such that $(N_i, M)$ is $k_i$-to-one and $p_j$ divides $k_i$.

(v) If $\Omega_{N_i/M}$ is generically $k_i$-to-one for each $i$, then $\Omega$ is at least $\operatorname{lcm}(k_i)$-to-one, and at most $\left(\prod_{i=1}^{n} k_i\right)$-to-one.

Also, note that it may happen that some masks $(N_i, M)$ are identifiable, while some other masks $(N_j, M)$ are finitely identifiable but not identifiable.

The statements can be proved by applying Galois theory to the fact that $\dim \Omega_{N/M}$ is the same as the transcendence degree of the field extension

$$\mathbb{C}(\Omega_N(\mathcal{M}(m \times n, r))) \,/\, \mathbb{C}(\Omega_M(\mathcal{M}(m \times n, r))).$$

To a partial mask, one can also associate a graph structure, namely, an injective graph morphism:

Definition 2.4.9. Let $\Omega$ be a partial masking with partial mask $M(\Omega) = (N, M)$. We will call the unique injective map of bipartite graphs $G(\Omega) = (G \hookrightarrow H)$, where $G$ is the adjacency graph of $M$, $H$ is the adjacency graph of $N$, and the injection identifies vertices, the adjacency map of $(N, M)$.

Similarly, the adjacency map of a partial masking uniquely characterizes its completion properties; however, a more thorough discussion requires the matroidal structure of matrix completion, which will be developed in the following sections.

2.5 Differential Analysis of Completion Patterns 

This section is devoted to the analysis of the degrees of freedom in the matrix entries and how 
they interact. In particular, we want to develop tools which allow us to see which entries of a 
mask can be chosen independently, omitted without loss of information, or reconstructed from 
the known ones. 

2.5.1 Degrees of Freedom 

Concerning completability and identifiability, it is a natural question which degrees of freedom are, in the case of a generically sampled true matrix, contained in the masked matrix. That is, how many degrees of freedom get killed by the masking operation, and how to combinatorially or algebraically quantify and qualify them. Formally phrased, the question we want to answer in this and the following sections is:

Question 2.5.1. Let $\Omega$ be a masking in rank $r$. How many degrees of freedom are in its image $\Omega(\mathcal{M}(m \times n, r))$, depending on the mask $M(\Omega)$?

Finite completability of a mask can then be rephrased as the fact that the image and the domain of $\Omega$ have the same number of degrees of freedom, namely $d_r(G(\Omega)) = r \cdot (m + n - r)$. As we have already seen in the previous section, Algebraic Geometry provides tools to bound degrees of freedom (the formal concept for that is the (Krull) dimension), and in particular, the number of degrees of freedom which are lost by applying $\Omega$ (to a generic matrix) is exactly the generic fiber dimension $\dim \Omega$. In this and the following section, we will go a step further and develop tools which in the end will allow us to combinatorially study and algorithmically determine the exact number of degrees of freedom, in terms of the structure of the map $\Omega$, namely the mask $M(\Omega)$, and the true rank $r$. Singer and Cucuringu [38] have already proposed a probabilistic algorithm for checking finite completability of a mask based on differentials, without giving a proof of its correctness. The results of this and the following sections will allow us to later fill that gap, and provide more general algorithms for the mentioned purposes.

The main ingredient in the following is a refinement of a classical instrument from calculus, 
the Jacobian matrix. The Jacobian and its generalizations are also classic tools in Algebraic 
Geometry to describe fiber dimension of an algebraic map. We will now describe in which specific 
instance it occurs in the context of Matrix Completion. 



The masking $\Omega$ is, as it was defined in Definition 2.1.4, a map

$$\Omega : \mathcal{M}(m \times n, r) \to \mathbb{C}^{\alpha}.$$

If $A \in \mathcal{M}(m \times n, r)$, then there exist matrices $U \in \mathbb{C}^{m \times r}$ and $V \in \mathbb{C}^{n \times r}$ such that $A = U \cdot V^\top$. Conversely, given matrices $U \in \mathbb{C}^{m \times r}$ and $V \in \mathbb{C}^{n \times r}$, the matrix $A = U \cdot V^\top$ has rank at most $r$. This means, reformulated, that the set $\mathcal{M}(m \times n, r)$ can be parameterized (non-uniquely) by $\mathbb{C}^{(m+n) \times r}$, via the surjective map

$$\mu : \mathbb{C}^{(m+n) \times r} = \mathbb{C}^{m \times r} \times \mathbb{C}^{n \times r} \to \mathcal{M}(m \times n, r), \qquad (U, V) \mapsto U \cdot V^\top.$$

So the composition of maps $\Omega \circ \mu$ is a map from complex $((m + n) \times r)$-space into complex $\alpha$-space, and its fiber dimension can be computed by the Jacobian matrix. That is, write

$$\Omega \circ \mu = (f_1, \dots, f_\alpha)$$

with algebraic maps $f_k$. Consider the Jacobian matrix

$$J(U, V) = \left( \frac{\partial f_k}{\partial U_{ij}} \;\Big|\; \frac{\partial f_k}{\partial V_{ij}} \right)_{1 \le k \le \alpha},$$

where the derivatives have to be taken over all entries of the matrices $U, V$, i.e., all $U_{ij}$ with $1 \le i \le m$ and $1 \le j \le r$ and all $V_{ij}$ with $1 \le i \le n$ and $1 \le j \le r$. It is possible to show with classical tools from Algebraic Geometry$^{12}$ that the fiber dimension $\dim_A \Omega = \dim \Omega^{-1}(\Omega(A))$, at some fixed matrix $A = U \cdot V^\top$, equals $d_r(m, n) - \operatorname{rk} J(U, V)$. Thus, the matrix $A$ can be reconstructed, up to finite choice, from $\Omega(A)$ if and only if $J(U, V)$ has rank $d_r(G(\Omega)) = r \cdot (m + n - r)$. This means that finite completability is determined by the rank of the Jacobian at generic $U, V$, or at a generic $A$ of rank $r$.

Arguments of this type can be refined to yield degrees-of-freedom statements on any set of entries of the matrix $A$. Namely, to each entry of $A$, one can associate one row of the Jacobian; if one has more than one entry of $A$, one can calculate the degrees of freedom lying in those by the dimension of the span of the corresponding row vectors. The following subsections will be devoted to giving an exposition of the proper mathematical tools from Combinatorial Commutative Algebra to do so, and how to apply them.
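
The Jacobian criterion above suggests a simple randomized test, which can be sketched as follows. This is an illustrative sketch and not the algorithm of [38] or of the later sections: the names `rank`, `jacobian_rows`, `degrees_of_freedom` and `is_finitely_completable`, the choice of a random integer point as a stand-in for a generic $(U, V)$, and the use of exact rational arithmetic are all assumptions. The entry $(i, j)$ contributes the polynomial $\sum_k U_{ik} V_{jk}$, whose partial derivatives are $V_{jk}$ with respect to $U_{ik}$ and $U_{ik}$ with respect to $V_{jk}$. The sketch assumes $m, n \ge r$.

```python
from fractions import Fraction
import random

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(A[0]) if A else 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        for i in range(rk + 1, len(A)):
            factor = A[i][c] / A[rk][c]
            if factor:
                A[i] = [a - factor * b for a, b in zip(A[i], A[rk])]
        rk += 1
    return rk

def jacobian_rows(mask, U, V):
    """One Jacobian row per observed entry (i, j) of the mask."""
    m, n, r = len(U), len(V), len(U[0])
    rows = []
    for i in range(m):
        for j in range(n):
            if mask[i][j]:
                row = [0] * ((m + n) * r)
                for k in range(r):
                    row[i * r + k] = V[j][k]          # derivative w.r.t. U[i][k]
                    row[(m + j) * r + k] = U[i][k]    # derivative w.r.t. V[j][k]
                rows.append(row)
    return rows

def degrees_of_freedom(mask, r, seed=0):
    """Jacobian rank of the observed entries at a random integer point."""
    m, n = len(mask), len(mask[0])
    rng = random.Random(seed)
    U = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(m)]
    V = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(n)]
    return rank(jacobian_rows(mask, U, V))

def is_finitely_completable(mask, r, seed=0):
    """Compare the Jacobian rank with r * (m + n - r); assumes m, n >= r."""
    m, n = len(mask), len(mask[0])
    return degrees_of_freedom(mask, r, seed) == r * (m + n - r)
```

Calling `degrees_of_freedom` on a mask that marks only a subset of entries gives the dimension of the span of the corresponding Jacobian rows at the sampled point. A randomly drawn point can in principle be non-generic, in which case the computed rank is too small, so a negative answer from this sketch is not a proof.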

12 One has $\dim_{(U,V)}(\Omega \circ \mu) = \dim_{(U,V)} \mu + \dim_A \Omega$, and $\dim_{(U,V)} \mu = r^2$, since one can show that the representation $U \cdot V^\top$ is unique up to the substitution $U' = UB$ and $V' = V (B^{-1})^\top$ with an invertible $(r \times r)$-matrix $B$. On the other hand, one has $\dim_{(U,V)}(\Omega \circ \mu) = r(m + n) - \operatorname{rk} J(U, V)$ by the Jacobian criterion.
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2.5.2 Differentials 



In this section, we will develop the basic theory of derivatives and differentials, which is a classi- 
cal tool from commutative algebra to quantify and qualify degrees of freedom and dependencies 
between objects. 

The basic idea which characterizes dependence of algebraic quantities, e.g., the entries of a 
low-rank matrix, is that of algebraic dependence. In the end, we will see that it is also the right 
concept to count the degrees of freedom, as it exposes analogies to linear dependence in Linear 
Algebra. 



Definition 2.5.2. Let $K$ be a field over $\mathbb{C}$ (e.g., the field of rational functions$^{13}$ $\mathbb{C}(T_1, \dots, T_m)$), let $a_1, \dots, a_n \in K$. Then, $a_1, \dots, a_n$ are called algebraically dependent (over $\mathbb{C}$) if there is a non-zero polynomial$^{14}$ $f \in \mathbb{C}[X_1, \dots, X_n]$ such that

$$f(a_1, \dots, a_n) = 0.$$

Else we call the $a_i$ algebraically independent (over $\mathbb{C}$).

Intuitively, this means that if $a_1, \dots, a_n$ are algebraically dependent, then knowing some $n - 1$ of the $a_i$ implies knowing that the remaining $a_j$ must be one of finitely many values. Alternatively, one can think of an algebraically independent set of $n$ elements as carrying one degree of freedom each, in total $n$, while an algebraically dependent set of elements has redundancies and strictly less than $n$ degrees of freedom$^{15}$. We will explain this by the next example and give a more precise statement in Proposition 2.5.4.

Example 2.5.3. Consider the (formal) polynomials $a_1 = X^2$, $a_2 = XY$, $a_3 = Y^2$. The three $a_i$ are algebraically dependent, since for

$$f(X_1, X_2, X_3) = X_1 X_3 - X_2^2,$$

one calculates that $f(a_1, a_2, a_3) = 0$. On the other hand, there cannot be a non-zero polynomial $g(X_1, X_2)$ which evaluates to zero when substituting any two of the $a_i$, as there is always one of the two $a_i$ which contains a variable (i.e., $X$ or $Y$) which the other does not.

Now assume there is some truth $(X, Y)$ and we measure some of the $a_i$. For a generic truth $(X, Y)$, knowing any two of the $a_i$ will allow us to predict the third via $f(a_1, a_2, a_3) = 0$, up to a finite choice. For example, knowing $a_1$ and $a_2$, one can recover $a_3 = a_2^2 / a_1$ exactly. On the other hand, when one knows $a_1$ and $a_3$, the recovery of $a_2 = \pm\sqrt{a_1 a_3}$ is only possible up to sign, i.e., one has two choices here. Moreover, knowing only one of the $a_i$ allows no prediction whatsoever on the other $a_j$, since one degree of freedom remains, either in choosing $(X, Y)$ or any second of the $a_i$.
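
The recovery statements of this example can be checked with exact arithmetic. The snippet below is a small sketch; the particular "truth" $(X, Y) = (3/2, -5/7)$ and the helper names `measurements` and `f` are arbitrary choices made for illustration.

```python
from fractions import Fraction

def measurements(x, y):
    """The three measured quantities a1 = X^2, a2 = XY, a3 = Y^2."""
    return x * x, x * y, y * y

def f(a1, a2, a3):
    """The algebraic relation f(X1, X2, X3) = X1 * X3 - X2^2."""
    return a1 * a3 - a2 * a2

x, y = Fraction(3, 2), Fraction(-5, 7)   # arbitrary truth, chosen for illustration
a1, a2, a3 = measurements(x, y)

relation_value = f(a1, a2, a3)           # vanishes on all measurements
recovered_a3 = a2 * a2 / a1              # a3 is determined uniquely by a1, a2
a2_squared = a1 * a3                     # a1, a3 only determine a2 up to sign
```

Both `a2` and `-a2` satisfy the relation for the given `a1` and `a3`, which is the two-fold ambiguity described in the example.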

The behavior in Example 2.5.3 occurs in fact in all similar situations:

Proposition 2.5.4. Let $K = \mathbb{C}(X_1, \dots, X_n)$, let $a_1, \dots, a_k \in K$. Then, $a_1, \dots, a_k$ are algebraically dependent over $\mathbb{C}$ if and only if for generic $x \in \mathbb{C}^n$ and after a possible reordering of the indices of the $a_i$, the values $a_1(x), \dots, a_{k-1}(x)$ determine the value $a_k(x)$ up to finite choice.

13 $\mathbb{C}(T_1, \dots, T_m)$ is the set of all formal fractional functions $f/g$, where $f$ and $g$ are in $\mathbb{C}[T_1, \dots, T_m]$, see next footnote.

14 $\mathbb{C}[X_1, \dots, X_n]$ is the set of polynomials in the $n$ variables $X_1, \dots, X_n$ and with coefficients in $\mathbb{C}$.

15 In Algebra, this is formalized by the transcendence degree of the field extension.

Proof. We prove both directions by a series of equivalences. The fact that $a_1, \dots, a_k$ are algebraically dependent over $\mathbb{C}$, by definition, is equivalent to the fact that there exists a non-zero complex polynomial

$$P : \mathbb{C}^k \to \mathbb{C}$$

such that $P(a_1, \dots, a_k) = 0$. Since $P$ is non-zero, we can reorder the indices of the $a_i$ such that the polynomial

$$P(a_1, \dots, a_{k-1}, T)$$

depends non-trivially on the variable $T$. So, equivalently, for generic $x$,

$$P(a_1(x), \dots, a_{k-1}(x), T) = 0,$$

where $P$ is not a constant polynomial in $T$. This is equivalent, for $T = a_k(x)$, to the fact that $a_k(x)$ is determined up to finite choice for generic $x$. □



Proposition 2.5.4 shows that algebraic dependency is the proper concept to treat degrees of freedom in the Matrix Completion setting; however, as can be seen from Example 2.5.3 or more complicated examples such as matrix completion itself, it is not always straightforward to determine the existence of algebraic dependencies, or to prove their non-existence, when given the measurement polynomials.

The central idea which makes the latter theoretically and also algorithmically feasible is the differential study of the polynomials, as already explained in Section 2.5.1. Namely, the existence of dependencies and their degrees of freedom can be studied by the formal, or by the evaluated derivatives of the polynomials.

For a more concise description, we need to introduce the concept of formal differentials 
and their evaluations first. Here, we adopt an ad-hoc definition of differentials; more natural 
definitions and further results can be found in any introductory book on Commutative Algebra. 

Definition 2.5.5. Let $K$ be a field over $\mathbb{C}$ (i.e., $\mathbb{C} \subseteq K$), with multiplication $\cdot_K$. The set of formal differentials of $K$ over $\mathbb{C}$ is the set

$$\Omega_{K/\mathbb{C}} = \{ f \cdot dg \;;\; f, g \in K \} / \sim,$$

where $\sim$ is the equivalence relation given by

(i) $da = 0$ for all $a \in \mathbb{C}$

(ii) $d(af) = a\, df$ for all $a \in \mathbb{C}$ and $f \in K$

(iii) $d(f + g) = df + dg$ for all $f, g \in K$

(iv) $d(f \cdot_K g) = g \cdot df + f \cdot dg$ for all $f, g \in K$

where we write $0 = d0$ and $df = 1 \cdot df$ for all $f \in K$.
A multiplication $\cdot : K \times \Omega_{K/\mathbb{C}} \to \Omega_{K/\mathbb{C}}$ is defined as

$$(f, g \cdot dh) \mapsto (f \cdot_K g) \cdot dh,$$

making $\Omega_{K/\mathbb{C}}$ a vector space over $K$. When clear from the context, we will omit $\cdot$ and/or $\cdot_K$.
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Intuitively, the equivalence relation ~ imposes all usual differentiation rules which are com- 
monly known, e.g., for rational functions: 

Example 2.5.6. Let $K$ be the field $K = \mathbb{C}(X_1, \dots, X_n)$, i.e., the set of all rational functions in the formal variables $X_1, \dots, X_n$ with addition and multiplication. Then,

$$\Omega_{K/\mathbb{C}} = \left\{ \sum_{i=1}^{n} f_i\, dX_i \;;\; f_i \in K \right\}.$$

That means, if $f = f(X_1, \dots, X_n)$ is any rational function in the $X_i$, we can always write $df$ in the form

$$df = \sum_{i=1}^{n} f_i\, dX_i \quad \text{for some } f_i \in K.$$

It is also known from basic calculus what the $f_i$ are, and that, given $f$, they are unique. Namely,

$$f_i = \frac{\partial f}{\partial X_i}.$$



For example, $d(X_1 X_2) = X_2\, dX_1 + X_1\, dX_2$.

To the applied community, the formal operator $d$ may also be known as the so-called total derivative.

If $K$ is a rational function field, this behavior always occurs:

Proposition 2.5.7. Let $K = \mathbb{C}(X_1, \dots, X_n)$. Then, $\Omega_{K/\mathbb{C}}$ is the $n$-dimensional $K$-vector space spanned by the formal differentials $dX_i$, $1 \le i \le n$. Given $f \in K$, there exist unique $f_1, \dots, f_n \in K$ such that

$$df = \sum_{i=1}^{n} f_i\, dX_i.$$

Proof. This follows from the uniqueness of the partial derivative of a complex rational function. □

Definition 2.5.8. The rational functions $f_i \in K$ from Proposition 2.5.7 are called formal partial derivatives (of $f$ with respect to $X_i$) and denoted by

$$\frac{\partial f}{\partial X_i} := f_i.$$

If $K$ is a rational function field, the differentials can also be evaluated with respect to some point:

Definition 2.5.9. Let $K = \mathbb{C}(X_1, \dots, X_n)$. Let $f \in K$. For $P \in \mathbb{C}^n$, we define the evaluation of $df$ at the point $P$ as

$$df|_P := \sum_{i=1}^{n} \left.\frac{\partial f}{\partial X_i}\right|_P dX_i.$$

Note that any evaluation yields a vector in the $n$-dimensional $\mathbb{C}$-vector space spanned by the formal differentials $dX_i$, $1 \le i \le n$.

The following classical result relates algebraic dependence to linear dependence:

Proposition 2.5.10. Let $K$ be a field over $\mathbb{C}$, let $a_1, \dots, a_n \in K$. Then, $a_1, \dots, a_n$ are algebraically dependent if and only if $da_1, \dots, da_n$ are linearly dependent$^{16}$ in $\Omega_{K/\mathbb{C}}$ (considered as a $K$-module).

Proof. Since $K$ contains $\mathbb{C}$, the extension $K/\mathbb{C}$ is always separable. The statement is, for example, implied by [10, Proposition 16.14]. □



Example 2.5.11. Let us consider the entries of a $(2 \times 2)$-matrix of rank 1,

$$A = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix},$$

where we consider the entries as indeterminates subject to the equation

$$f(a_{11}, a_{12}, a_{21}, a_{22}) = a_{11} a_{22} - a_{12} a_{21} = 0.$$

Formally, the indeterminates live in the field $K = \mathbb{C}(a_{11}, a_{12}, a_{21}, a_{22})$. The equation above, by differentiating, gives a linear equation

$$df = a_{11}\, da_{22} + a_{22}\, da_{11} - a_{12}\, da_{21} - a_{21}\, da_{12} = 0.$$

Since the coefficients of all $da_{ij}$ in this equation are non-zero polynomials, we see that any three of the $da_{ij}$ are linearly independent. By Proposition 2.5.10, any three of the $a_{ij}$ are algebraically independent. Indeed, if $A$ is a generic matrix of rank 1, then any three of the four $a_{ij}$ can be fixed independently, determining the remaining one up to a finite choice.
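
The linear-algebraic content of this example can be verified numerically. The sketch below rests on two assumptions that are not part of the example itself: it uses the parameterization $a_{ij} = u_i v_j$ of rank-1 matrices, and it evaluates the differentials at one fixed rational point $(u_1, u_2, v_1, v_2) = (2, 3, 5, 7)$, a step justified by Proposition 2.5.12 below. The helper name `rank` is likewise an assumption.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(A[0]) if A else 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        for i in range(rk + 1, len(A)):
            factor = A[i][c] / A[rk][c]
            if factor:
                A[i] = [a - factor * b for a, b in zip(A[i], A[rk])]
        rk += 1
    return rk

# Evaluation point for (u1, u2, v1, v2); an arbitrary choice for illustration.
u1, u2, v1, v2 = 2, 3, 5, 7

# Gradients of a_ij = u_i * v_j with respect to (u1, u2, v1, v2).
gradients = {
    "a11": [v1, 0, u1, 0],
    "a12": [v2, 0, 0, u1],
    "a21": [0, v1, u2, 0],
    "a22": [0, v2, 0, u2],
}

# Every choice of three entries gives three independent differentials ...
triple_ranks = [rank([gradients[k] for k in triple])
                for triple in combinations(gradients, 3)]
# ... while all four together still only span a 3-dimensional space.
full_rank = rank(list(gradients.values()))
```

Each triple has rank 3 and all four gradients together also have rank 3, matching the statement that any three entries are independent while the fourth is determined by them.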

We now present the central result which will allow us to algorithmically test algebraic inde- 
pendence of the entries in Matrix Completion: 

Proposition 2.5.12. Let $K = \mathbb{C}(X_1, \dots, X_m)$, let $f_1, \dots, f_n \in K$. Let $P \in \mathbb{C}^m$ be generic. Then,
$f_1, \dots, f_n$ are algebraically dependent if and only if $df_1|_P, \dots, df_n|_P$ are linearly dependent vectors
in the $m$-dimensional $\mathbb{C}$-vector space spanned by the formal differentials $dX_1, \dots, dX_m$.



Proof. By Proposition 2.5.10 it suffices to prove: $df_1|_P, \dots, df_n|_P$ are linearly dependent (over
$\mathbb{C}$) if and only if $df_1, \dots, df_n$ are linearly dependent (over $K$).

First we prove the if-direction. $df_1, \dots, df_n$ are linearly dependent if and only if there exist
$\lambda_1, \dots, \lambda_n \in K$, not all zero, such that

$$\sum_{i=1}^{n} \lambda_i \, df_i = 0.$$

Thus, we also have

$$\sum_{i=1}^{n} \lambda_i|_P \, df_i|_P = 0,$$

where the $\lambda_i|_P$ are not all zero due to the genericity of $P$. Thus we have proved that
$df_1|_P, \dots, df_n|_P$ are linearly dependent.

Now we prove the only-if direction. We assume $df_1, \dots, df_n$ are linearly independent (thus,
$n \le m$) and show that $df_1|_P, \dots, df_n|_P$ are also linearly independent. Since we assumed $df_1, \dots, df_n$
to be linearly independent, it follows, possibly after some reordering of the indices of the $f_i$ and $X_i$,
that the Jacobi polynomial

$$J := \det \begin{pmatrix} \frac{\partial f_1}{\partial X_1} & \cdots & \frac{\partial f_1}{\partial X_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial X_1} & \cdots & \frac{\partial f_n}{\partial X_n} \end{pmatrix}$$

is not the zero polynomial. Thus, the evaluation $J(P)$ will be non-zero due to the genericity$^{17}$ of
$P$. Thus,

$$0 \ne J(P) = \det \begin{pmatrix} \frac{\partial f_1}{\partial X_1}\big|_P & \cdots & \frac{\partial f_1}{\partial X_n}\big|_P \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial X_1}\big|_P & \cdots & \frac{\partial f_n}{\partial X_n}\big|_P \end{pmatrix},$$

and linear independence of $df_1|_P, \dots, df_n|_P$ follows. □

$^{16}$ Cave: the definition of linearly dependent in $\Omega_{K/\mathbb{C}}$, as a $K$-module, allows for coefficients in $K$, as opposed to
coefficients in $\mathbb{C}$. That is, $da_1, \dots, da_n$ are linearly dependent if and only if there exist $\lambda_1, \dots, \lambda_n \in K$, not all zero,
such that $\sum_{i=1}^{n} \lambda_i \, da_i = 0$. Again, note that this is different from linear dependence over $\mathbb{C}$.



Example 2.5.13. Keep the notations of Example 2.5.11. Substituting any rank 1 matrix $A$ with
non-zero coefficients gives

$$df = a_{11}\, da_{22} + a_{22}\, da_{11} - a_{12}\, da_{21} - \frac{a_{11} a_{22}}{a_{12}}\, da_{12} = 0.$$

If $a_{11}, a_{12}, a_{22}$ are generically sampled, $f$ will always give rise to a non-zero dependence between the
$da_{ij}$. For example, if

$$A = \begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix},$$

one obtains the evaluated equation

$$df = 2\, da_{22} + 2\, da_{11} - 4\, da_{21} - da_{12} = 0.$$
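The evaluated differential of Example 2.5.13 can be checked numerically. The following is a minimal sketch, not part of the original text; the function names `det_gradient` and `evaluate_df` are hypothetical. It computes the coefficients of $df$ for $f = a_{11} a_{22} - a_{12} a_{21}$ and confirms that $df$ vanishes on a direction tangent to the rank-1 matrices at $A$.

```python
def det_gradient(a11, a12, a21, a22):
    """Coefficients of (da11, da12, da21, da22) in df for f = a11*a22 - a12*a21."""
    return (a22, -a21, -a12, a11)


def evaluate_df(A, dA):
    """Apply the evaluated differential df|_A to a perturbation dA of the entries."""
    (a11, a12), (a21, a22) = A
    (d11, d12), (d21, d22) = dA
    c11, c12, c21, c22 = det_gradient(a11, a12, a21, a22)
    return c11 * d11 + c12 * d12 + c21 * d21 + c22 * d22


# The rank-1 matrix from the example: A = u v^T with u = (2, 1), v = (1, 2).
A = ((2, 4), (1, 2))
coefficients = det_gradient(2, 4, 1, 2)

# Moving u in direction (1, 0) stays inside the rank-1 matrices to first order:
# dA = (1, 0)^T (1, 2).
tangent_direction = ((1, 2), (0, 0))
# Changing only the entry a11 leaves the rank-1 matrices.
non_tangent_direction = ((1, 0), (0, 0))
```

Here `coefficients` lists the coefficients in the order $(da_{11}, da_{12}, da_{21}, da_{22})$, so the value $-4$ belongs to $da_{21}$ and $-1$ to $da_{12}$.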

We reformulate the results stated so far by collecting the relevant consequences for Matrix 
Completion: 

Theorem 2.5.14. Let $(N, M)$ be a partial mask. For $1 \le i \le m$, $1 \le j \le n$, let $a_{ij}$ be the formal
variable for the $(ij)$-th entry of an $(m \times n)$ rank $r$-matrix, i.e., we present the ring of the determinantal
variety $\mathcal{M}(m \times n, r)$ as

$$\mathbb{C}[\mathcal{M}(m \times n, r)] = \mathbb{C}[a_{11}, \dots, a_{ij}, \dots, a_{mn}] / I(\mathcal{M}(m \times n, r)),$$

where $I(\mathcal{M}(m \times n, r))$ is the determinantal ideal of rank $r$. Then, the following are equivalent:

(i) $(N, M)$ is finitely identifiable in rank $r$.

(ii) There exists a subset $S \subseteq E(M)$ such that the set $A = \{a_{ij} ; (ij) \in S\}$ is algebraically independent
over $\mathbb{C}$, and for any $(k\ell) \in E(N)$, the set $\{a_{k\ell}\} \cup A$ is algebraically dependent.

(iii) The field extension $\mathbb{C}(a_{ij}, (ij) \in E(N)) / \mathbb{C}(a_{ij}, (ij) \in E(M))$ is finite.

(iv) There exists a subset $S \subseteq E(M)$ such that the set $\mathcal{A} = \{da_{ij} ; (ij) \in S\}$ is linearly independent,
and for any $(k\ell) \in E(N)$, the set $\{da_{k\ell}\} \cup \mathcal{A}$ is linearly dependent (over $K = \mathbb{C}(a_{ij}, 1 \le i \le m, 1 \le j \le n)$).

(v) Let $U \in \mathbb{C}^{m \times r}$, $V \in \mathbb{C}^{n \times r}$ be generic, let $A = U V^\top$. Let $V_M = \operatorname{span}\{da_{ij}|_A ; (ij) \in E(M)\}$, and
$V_N = \operatorname{span}\{da_{ij}|_A ; (ij) \in E(N)\}$. One has $V_N \subseteq V_M$.

$^{17}$ A complex polynomial is non-zero if and only if it evaluates to a non-zero value almost everywhere; this follows, e.g., from the
Schwartz-Zippel lemma, or from the fact that proper algebraic sets have Lebesgue measure zero.

Proof. Let $\Omega$ be the partial masking defined by the mask $(N, M)$, i.e.,

$$\Omega : \Omega_N(\mathcal{M}(m \times n, r)) \to \Omega_M(\mathcal{M}(m \times n, r)).$$

The generic fiber dimension of $\Omega$ is exactly the transcendence degree of the field extension,

$$\dim \Omega = \dim \Omega_N(\mathcal{M}(m \times n, r)) - \dim \Omega_M(\mathcal{M}(m \times n, r)) = \operatorname{trdeg} \mathbb{C}(\Omega_N(\mathcal{M}(m \times n, r))) / \mathbb{C}(\Omega_M(\mathcal{M}(m \times n, r))).$$

Also, since $\Omega_N$ and $\Omega_M$ are projections onto the variables $a_{ij}$ in the respective edge sets, one has

$$\mathbb{C}(\Omega_N(\mathcal{M}(m \times n, r))) = \mathbb{C}(a_{ij}, (ij) \in E(N)),$$
$$\mathbb{C}(\Omega_M(\mathcal{M}(m \times n, r))) = \mathbb{C}(a_{ij}, (ij) \in E(M)).$$

The equivalence of (i), (ii) and (iii) follows from the above equalities. The equivalence of (iii) and
(iv) follows from Proposition 2.5.10, the equivalence of (iii) and (v) from Proposition 2.5.12. □



Theorem 2.5.14 is in particular also a statement for masks as originally defined in Definition 2.1.4,
by taking for $N$ the matrix containing only ones. As another important consequence,
one can immediately state the following:

Proposition 2.5.15. Let $M$ be an $(m \times n)$-mask. Then, there is a unique biggest (w.r.t. number of
ones) mask $N$ such that $(N, M)$ is finitely identifiable.

Proof. Keeping the notations of Theorem 2.5.14 (i) and (v), there is a unique biggest set $S$ such
that the vector space $V_S = \operatorname{span}\{da_{ij}|_A ; (ij) \in S\}$ is contained in $V_M$. Taking $N$ with edge set
$E(N) := S$, and using the equivalence of (i) and (v), this proves the proposition. □



Proposition 2.5.15 motivates the following definition:

Definition 2.5.16. Let $M$ be an $(m \times n)$-mask. The unique biggest (w.r.t. number of ones) mask $N$
such that $(N, M)$ is finitely identifiable in rank $r$ is called the (finite) completable closure of $M$ in rank
$r$.



The following remark shows how Theorem 2.5.14 (v) can be made into an algorithmic rule 
to determine finite completability: 
Remark 2.5.17. Generic dependencies between entries of the matrix $A$ can be determined using the
affine parameterization of low-rank matrices:

Let $A \in \mathbb{C}^{m \times n}$ be a matrix of rank $r$ or less; then, there exist matrices $U \in \mathbb{C}^{m \times r}$ and $V \in \mathbb{C}^{n \times r}$
such that $A = U V^\top$, and conversely, any matrix of the form $A = U V^\top$ has rank at most $r$. Writing
$u_i$, $1 \le i \le m$ for the rows of $U$ and $v_j$, $1 \le j \le n$ for the rows of $V$, one obtains the equation

$$a_{ij} = u_i v_j^\top \quad \text{for all } 1 \le i \le m, \ 1 \le j \le n.$$

Thus, one can consider the elements $a_{ij}$ to be contained in the field of rational functions

$$K = \mathbb{C}(\dots, U_{ij}, \dots, V_{k\ell}, \dots)$$

in the $r \cdot (m + n)$ variables $U_{ij}$, $1 \le i \le m$, $1 \le j \le r$ and $V_{k\ell}$, $1 \le k \le n$, $1 \le \ell \le r$, which correspond
to the (formal) entries of $U$ and $V$. Thus, the equation

$$a_{ij} = u_i v_j^\top = \sum_{k=1}^{r} U_{ik} V_{jk}$$

gives rise to the differential expansion

$$da_{ij} = du_i \cdot v_j^\top + u_i \cdot dv_j^\top = \sum_{k=1}^{r} \left( V_{jk}\, dU_{ik} + U_{ik}\, dV_{jk} \right).$$

Thus, using Proposition 2.5.12, algebraic dependency for some set of $a_{ij}$ can be evaluated by choosing
some random generic value $U_0, V_0$ for $U, V$, and then testing for linear dependency of the $r \cdot (m + n)$-dimensional
vectors $da_{ij}|_{(U_0, V_0)}$, which live in the $\mathbb{C}$-vector space generated by all the formal
differentials $dU_{ij}$ and $dV_{ij}$.

Remark 2.5.17, together with Theorem 2.5.14 for the case of a non-partial masking, also
shows correctness of Algorithm 3 by Singer and Cucuringu [38].
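The test described in Remark 2.5.17 can be sketched in a few lines. This is an illustrative sketch only, not the implementation of [38]: the names `rank`, `differential_row`, `generic_point` and `entries_rank` are hypothetical, a random integer point stands in for a "generic" one (this fails only on a measure-zero set), and exact rational arithmetic is used to avoid numerical rank decisions.

```python
import random
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over Q."""
    mat = [[Fraction(x) for x in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    rk = 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for i in range(rk + 1, len(mat)):
            if mat[i][col] != 0:
                factor = mat[i][col] / mat[rk][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def differential_row(i, j, U, V):
    """Coordinates of da_ij = sum_k (V_jk dU_ik + U_ik dV_jk) in the basis
    dU_00, ..., dU_(m-1)(r-1), dV_00, ..., dV_(n-1)(r-1)."""
    m, n, r = len(U), len(V), len(U[0])
    row = [0] * (r * (m + n))
    for k in range(r):
        row[i * r + k] = V[j][k]
        row[m * r + j * r + k] = U[i][k]
    return row


def generic_point(m, n, r, seed=0):
    """Random non-zero integer factors (U0, V0), used as a stand-in for a generic point."""
    rng = random.Random(seed)
    U = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(m)]
    V = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(n)]
    return U, V


def entries_rank(entries, U, V):
    """Dimension of the span of the evaluated differentials of the given entries."""
    return rank([differential_row(i, j, U, V) for (i, j) in entries])


# Rank-1, 2 x 2 case of Example 2.5.11: any three entries are independent,
# all four together are dependent.
U0, V0 = generic_point(2, 2, 1)
three = entries_rank([(0, 0), (0, 1), (1, 0)], U0, V0)
four = entries_rank([(0, 0), (0, 1), (1, 0), (1, 1)], U0, V0)
```

A set of entries is algebraically independent (generically) exactly when `entries_rank` equals the number of entries; in the 2 x 2 rank-1 case this reproduces the conclusion of Example 2.5.11.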
2.5.3 Completion Matroids 



In Section 2.5.2 we have seen that dependence of entries in any low-rank matrix may be checked
by calculating the vector space spanned by the tangent vectors at a generic low-rank matrix. More
specifically, Theorem 2.5.14 shows that sets of entries of the matrix exhibit exactly the
same dependence properties as sets of vectors in a vector space: algebraically independent entries behave
like linearly independent vectors. In the following, we could in principle use
this established link to prove dependence and degrees of freedom properties of maskings and
partial maskings. Instead, we will introduce an abstract concept which bundles the relevant
combinatorial properties for both linear and algebraic dependencies, the matroid, which will
then allow us to derive and state the results in a concise and more readable manner.

We begin by axiomatically defining what a matroid is. A matroid generalizes properties of 
independent sets of vectors in a vector space: 

Definition 2.5.18. Let $E$ be a set. A collection $\mathcal{I}$ of subsets of $E$ is called a matroid (over $E$), if it fulfills
the following conditions:

(i) $\emptyset \in \mathcal{I}$.

(ii) Let $I \in \mathcal{I}$, and $J \subseteq I$. Then $J \in \mathcal{I}$.

(iii) Let $I, J \in \mathcal{I}$, let $\#I < \#J$. Then there is $e \in J$ with $e \notin I$ such that $I \cup \{e\} \in \mathcal{I}$.

The elements of $\mathcal{I}$ are called the independent sets (of the matroid).

Intuitively, a matroid is the collection of all subsets of E which are independent in the domain 
of E (e.g., linearly or algebraically independent). Definition 2.5.18 (ii) states that subsets of 
independent sets are also independent, Definition 2.5.18 (iii) is (resp. implies) a generalization 
of the basis exchange properties of independent sets in vector spaces. 
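For small ground sets, the three axioms of Definition 2.5.18 can be checked by brute force. The sketch below is an illustration under stated assumptions: the function name `is_matroid` is hypothetical, and the example family is the collection of all subsets of size at most three of the four entries of a $(2 \times 2)$ rank-1 matrix, which is what Example 2.5.11 suggests for the independent sets in that case.

```python
from itertools import combinations


def is_matroid(independent_sets):
    """Check axioms (i)-(iii) of Definition 2.5.18 for a finite family of sets."""
    family = {frozenset(s) for s in independent_sets}
    # (i) the empty set is independent
    if frozenset() not in family:
        return False
    # (ii) every subset of an independent set is independent
    for s in family:
        for size in range(len(s)):
            if any(frozenset(sub) not in family for sub in combinations(s, size)):
                return False
    # (iii) a smaller independent set can be augmented from a larger one
    for small in family:
        for large in family:
            if len(small) < len(large):
                if not any(small | {e} in family for e in large - small):
                    return False
    return True


entries = [(1, 1), (1, 2), (2, 1), (2, 2)]
# All subsets of at most three entries of a 2 x 2 rank-1 matrix.
rank_one_family = [set(c) for k in range(4) for c in combinations(entries, k)]
# A family violating the augmentation axiom (iii): {3} cannot be extended from {1, 2}.
not_a_matroid = [set(), {1}, {2}, {3}, {1, 2}]
```

The check is exponential in the size of the ground set, so it is only meant to make the axioms concrete on toy examples.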

Example 2.5.19. As stated, sets and subsets of vectors or algebraic elements give rise to matroids:

(i) Let $V = \mathbb{C}^n$ be a vector space, and $v_1, \dots, v_k \in V$. Let $E = \{v_1, \dots, v_k\}$. Then a basic fact from
Linear Algebra is that

$$\mathcal{I} := \{ I \subseteq E ; I \text{ is linearly independent} \},$$

the collection of linearly independent subsets of $E$, is a matroid (over $E$).

(ii) Let $K/\mathbb{C}$ be a field extension, e.g. $K = \mathbb{C}(X_1, \dots, X_n)$. Let $a_1, \dots, a_k \in K$, let $E = \{a_1, \dots, a_k\}$.
Then one can prove that

$$\mathcal{I} := \{ I \subseteq E ; I \text{ is algebraically independent} \}$$

is a matroid (over $E$). This is a special instance of algebraic matroids, see [31, chapter 5].

That (ii) is indeed a matroid can also be seen by Theorem 2.5.14, which in fact gives a one-to-one
correspondence to a linear matroid as in (i).

For reading convenience, we will introduce some of the usual matroid terminology: 

Definition 2.5.20. Let $\mathcal{I}$ be a matroid over $E$, let $S \subseteq E$. Then we call

(i) $S$ independent (w.r.t. $\mathcal{I}$) if $S \in \mathcal{I}$, else dependent,

(ii) $S$ a circuit if it is minimally dependent$^{18}$,

(iii) $B \subseteq S$ a basis of $S$ if it is a maximally independent subset$^{19}$ of $S$,

(iv) the maximal cardinality of a basis of $S$ the rank of $S$, and denote it by $\operatorname{rk}(S)$.

If $\mathcal{I}$ is an algebraic or linear matroid, we will at times add the qualifiers "algebraic" or "linear" to
avoid confusion, e.g., algebraically dependent set, or algebraic circuit.

$^{18}$ I.e., $C \subseteq E$ is called a circuit if $C$ is dependent and there does not exist $C' \subsetneq C$ such that $C'$ is dependent (w.r.t. $\mathcal{I}$).
$^{19}$ I.e., $B \subseteq S$ is called a basis of $S$ if $B$ is independent and there does not exist $B'$ with $S \supseteq B' \supsetneq B$ such that $B'$ is independent
(w.r.t. $\mathcal{I}$).

Matroids capture combinatorially the following facts which are well-known for finite vector
configurations:

Proposition 2.5.21. Let $\mathcal{I}$ be a matroid over $E$. Then,

(i) given $S \subseteq E$, one has $\operatorname{rk}(S) \le \#S$.

(ii) given $S \subseteq E$, one has $\operatorname{rk}(S) = \#S$ if and only if $S$ is independent.

(iii) given $S \subseteq E$, one has $\operatorname{rk}(S) + 1 = \#S$ if $S$ is a circuit.

(iv) given $S \subseteq E$, a subset $B \subseteq S$ is a basis of $S$ if and only if $\#B = \operatorname{rk}(B) = \operatorname{rk}(S)$.

(v) given distinct circuits $C_1, C_2 \subseteq E$, and $e \in C_1 \cap C_2$, there is a circuit $C \subseteq (C_1 \cup C_2) \setminus \{e\}$.

Proof. The proofs of the statements are elementary and can be found in [31]. (i) to (iii) can be
found at the beginning of section 1.3, (iv) is Lemma 1.2.4, and (v) is Lemma 1.1.3. □



Proposition 2.5.21 (i) and (ii) generalize the basis elimination principle from Linear Algebra,
and (v) is commonly called circuit elimination.

One of the most important facts for our algebraic situation is that the rank of a set of algebraic
elements is exactly the number of degrees of freedom it contains:

Proposition 2.5.22. Let $\mathcal{I}$ be the algebraic matroid corresponding to some collection of elements
$E = \{a_1, \dots, a_k\}$ over $\mathbb{C}$. For $S \subseteq E$, the rank $\operatorname{rk}(S)$ is exactly the transcendence degree $\operatorname{trdeg}(K/\mathbb{C})$,
where $K$ denotes the extension field $\mathbb{C}(a ; a \in S)$ of $\mathbb{C}$.

Proof. This is implied by the discussion between Examples 6.7.8 and 6.7.9 in [31] and the fact
that an algebraic matroid is a matroid. □

We will now introduce some matroid-related concepts which are unique for the problem of 
Matrix Completion, due to its inherent structures and symmetries: 

Definition 2.5.23. We will denote by $\mathcal{E}(m \times n, r)$ the set of entries $a_{ij}$, $1 \le i \le m$, $1 \le j \le n$ of a
matrix with rank at most $r$, interpreted as variables over $\mathbb{C}$.

We will denote by $\mathcal{D}(m \times n, r)$ the matroid over $\mathcal{E}(m \times n, r)$ consisting of algebraically independent
subsets of $\mathcal{E}(m \times n, r)$. It is called the algebraic independence matroid of $\mathcal{E}(m \times n, r)$, or
determinantal matroid.

To the elements of both $\mathcal{E}(m \times n, r)$ and $\mathcal{D}(m \times n, r)$, we will also refer by their respective indices.
That is, we will simultaneously consider $\mathcal{E}(m \times n, r)$ to be the set $\{(ij) \in \mathbb{N}^2 ; 1 \le i \le m, 1 \le j \le n\}$,
and we will simultaneously consider elements of $\mathcal{D}(m \times n, r)$ to be sets of index pairs.

For $E \subseteq \mathcal{E}(m \times n, r)$, we will denote by $\operatorname{rk}_r(E)$ the rank $\operatorname{rk}(E)$ of $E$ with respect to $\mathcal{D}(m \times n, r)$.
If $M$ is an $(m \times n)$ mask, we will also write $\operatorname{rk}_r(M) = \operatorname{rk}_r(E(M))$.

In the following, our goal is to make use of the fact that dependence structure of a low-rank 
matrix does not depend on the ordering of rows and columns; this implies additional structure 
for the determinantal matroid. In the proof of this, we need two lemmata: 

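Lemma 2.5.24. Let $\Omega : \mathcal{M}(m \times n, r) \to \mathbb{C}^{\alpha}$ be a masking. Then,

$$\operatorname{rk}_r(E(\Omega)) = d_r(m, n) - \dim \Omega.$$

Proof. This follows directly from the fact that $\dim \Omega$ is the same as the transcendence degree of
the field extension

$$\mathbb{C}(\mathcal{M}(m \times n, r)) / \mathbb{C}(\Omega(\mathcal{M}(m \times n, r))).$$

Moreover, one has

$$\dim(\Omega) = \operatorname{trdeg}(\mathbb{C}(\mathcal{M}(m \times n, r)) / \mathbb{C}(\Omega(\mathcal{M}(m \times n, r))))$$
$$= \operatorname{trdeg}(\mathbb{C}(\mathcal{M}(m \times n, r)) / \mathbb{C}) - \operatorname{trdeg}(\mathbb{C}(\Omega(\mathcal{M}(m \times n, r))) / \mathbb{C})$$
$$= \dim(\mathcal{M}(m \times n, r)) - \operatorname{rk}(E(\Omega)),$$

from which the statement follows. □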

Lemma 2.5.25. For arbitrary permutation matrices $P \in \mathbb{C}^{m \times m}$ and $Q \in \mathbb{C}^{n \times n}$, define a map

$$\mu(P, Q) : \mathcal{M}(m \times n, r) \to \mathcal{M}(m \times n, r), \quad A \mapsto P A Q.$$

Also, (for each $m, n$) define a map

$$T : \mathcal{M}(m \times n, r) \to \mathcal{M}(n \times m, r), \quad A \mapsto A^\top.$$

The maps $\mu(P, Q)$ and $T$ are well-defined algebraic morphisms which are isomorphisms.

Proof. Well-definedness follows from the fact that the rank of a matrix cannot increase when
multiplying with another matrix or transposing. The maps $\mu(P, Q)$ and $T$ are algebraic morphisms
because the defining rules are algebraic. The maps are isomorphisms, since $\mu(P, Q) \circ
\mu(P^{-1}, Q^{-1}) = \operatorname{id}$ and $T \circ T = \operatorname{id}$. □



Proposition 2.5.26. Let $M$ be an $(m \times n)$-mask. Then the rank $\operatorname{rk}_r(M)$ depends only$^{20}$ on $G(M)$.
Furthermore, $\operatorname{rk}_r(M)$ is equal to $\operatorname{rk}_r(M^\top)$.

Proof. Let $M$ and $N$ be $(m \times n)$-masks with maskings $\Omega_M$ and $\Omega_N$. For the first statement, it
suffices to prove that if there are permutation matrices $P \in \mathbb{C}^{m \times m}$ and $Q \in \mathbb{C}^{n \times n}$ such that
$M = P N Q$, then $\operatorname{rk}_r(M) = \operatorname{rk}_r(N)$. Consider

$$\Omega_M = \Omega_N \circ \mu(P, Q),$$

where $\mu(P, Q)$ is defined as in Lemma 2.5.25. By Lemma 2.5.25, $\mu(P, Q)$ is an isomorphism, so
$\dim \mu(P, Q) = 0$, thus it holds that

$$\dim \Omega_M = \dim \mu(P, Q) + \dim \Omega_N = \dim \Omega_N.$$

Applying Lemma 2.5.24 shows that $\operatorname{rk}_r(M) = \operatorname{rk}_r(N)$. Similarly,

$$\Omega_M = \Omega_{M^\top} \circ T.$$

By Lemma 2.5.25, $T$ is an isomorphism, so $\dim T = 0$, thus it holds that

$$\dim \Omega_M = \dim T + \dim \Omega_{M^\top} = \dim \Omega_{M^\top}.$$

Applying Lemma 2.5.24 shows that $\operatorname{rk}_r(M) = \operatorname{rk}_r(M^\top)$. □

$^{20}$ I.e., $\operatorname{rk}_r(M)$ does not depend on $m$, $n$, or the numbering of the vertices induced by the presentation in $M$, only on
the unlabelled graph structure given by $G(M)$.
Proposition 2.5.26, together with Proposition 2.5.21, which characterizes the relevant concepts
in terms of rank, shows that the following are well-defined:

Definition 2.5.27. If $E$ is a subset of $\mathcal{E}(m \times n, r)$, we will denote by $G(E)$ the graph with edge set
$E$, where we assume that $m, n$ are minimal and no superfluous, isolated vertices are present.
Let $G$ be a bipartite graph with edge set $E \subseteq \mathcal{E}(m \times n, r)$. We will say that

(i) $G$ is an independent graph (in rank $r$) if $E$ is an independent set in $\mathcal{D}(m \times n, r)$.

(ii) $G$ is a circuit graph (in rank $r$) if $E$ is a circuit in $\mathcal{D}(m \times n, r)$.

(iii) $H$ is a basis graph (in rank $r$) of $G$ if $E(H)$ is a basis of $E$ in $\mathcal{D}(m \times n, r)$.

(iv) the $r$-rank $\operatorname{rk}_r(G)$ of $G$ is the rank $\operatorname{rk}(E)$ of $E$ w.r.t. $\mathcal{D}(m \times n, r)$.



Proposition 2.5.26 also relates the completable closure to the concept of graph closure: 

Proposition 2.5.28. Let $M$ be an $(m \times n)$-mask with labeled bipartite graph $G$, let $N$ be the completable
closure of $M$ in rank $r$, with labeled bipartite graph $H$. Let $\gamma : G \hookrightarrow H$ be the corresponding
adjacency map. Then, $\gamma$ is equal to the closing

$$\operatorname{cl}[\phi_1, \dots, \phi_k] : G \to \operatorname{cl}[\phi_1, \dots, \phi_k](G) = H,$$

where the set of $\phi_i$ is the set of all injections of the form $C^- \hookrightarrow C$ such that $C$ is a circuit graph and $C^-$
is the graph $C$ with one edge removed.

Proof. Let $M, N$ be masks such that $N - M$ is a mask. The partial mask $(N, M)$ is finitely completable if
and only if the map

$$\Omega_{N/M} : \Omega_N(\mathcal{M}(m \times n, r)) \to \Omega_M(\mathcal{M}(m \times n, r))$$

is generically finite. The latter is equivalent to the field extension

$$\mathbb{C}(\Omega_N(\mathcal{M}(m \times n, r))) / \mathbb{C}(\Omega_M(\mathcal{M}(m \times n, r)))$$

being finite. By Proposition 2.5.22 and the definition of rank, this is equivalent to

$$\operatorname{rk}_r(N) = \operatorname{rk}_r(M).$$

Thus, by uniqueness of the completable closure in Proposition 2.5.15, the mask $N$ is the completable
closure of $M$ if and only if $E(N)$ is the biggest superset of $E(M)$ with the same rank. But
that is the matroid closure of $E(M)$ with respect to $\mathcal{D}(m \times n, r)$, which can be characterized by
closing circuits one-by-one; for a definition see chapter 1.4 of [31]. The latter is equivalent to
the graph closure described above due to Proposition 2.5.26. □
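The characterization of the completable closure as a matroid closure suggests a direct computation: add every entry whose evaluated differential already lies in the span of the observed ones (Theorem 2.5.14 (v)). The sketch below is illustrative only; the helper names are hypothetical, the point $(U_0, V_0)$ is a fixed non-zero integer point used in place of a generic one, and the example is in rank 1, where any point with non-zero coordinates behaves generically.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over Q."""
    mat = [[Fraction(x) for x in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    rk = 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for i in range(rk + 1, len(mat)):
            if mat[i][col] != 0:
                factor = mat[i][col] / mat[rk][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def differential_row(i, j, U, V):
    """Coordinates of da_ij in the basis of the formal differentials dU_ik, dV_jk."""
    m, n, r = len(U), len(V), len(U[0])
    row = [0] * (r * (m + n))
    for k in range(r):
        row[i * r + k] = V[j][k]
        row[m * r + j * r + k] = U[i][k]
    return row


def completable_closure(observed, U, V):
    """All positions (i, j) whose differential lies in the span of the observed ones."""
    m, n = len(U), len(V)
    base = [differential_row(i, j, U, V) for (i, j) in observed]
    base_rank = rank(base)
    closure = set(observed)
    for i in range(m):
        for j in range(n):
            if (i, j) not in closure:
                if rank(base + [differential_row(i, j, U, V)]) == base_rank:
                    closure.add((i, j))
    return closure


# Rank 1, 3 x 3: three observed entries forming a path in the bipartite graph.
U0 = [[2], [3], [5]]
V0 = [[7], [11], [13]]
closure = completable_closure({(0, 0), (0, 1), (1, 0)}, U0, V0)
```

In this example only the entry $(1, 1)$ is added, closing the 4-cycle on the first two rows and columns; positions involving the third row or column stay undetermined.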



Propositions 2.5.28 and 2.5.26 imply that the following definition on unlabeled bipartite
graphs captures all the information about the generic completability of masks associated with the
type.

Definition 2.5.29. Let $G$ be a bipartite graph. The closing $H = \operatorname{cl}[\phi_1, \dots, \phi_k](G)$ from
Proposition 2.5.28 is called the completable closure of $G$ in rank $r$.
Proposition 2.5.26 also allows us to state a Corollary of Proposition 2.5.22 for Matrix Completion
and Partial Matrix Completion:

Corollary 2.5.30. (i) Let $\Omega$ be some masking in rank $r$, let $G = G(\Omega)$. Then

$$\dim \Omega = d_r(G) - \operatorname{rk}_r(G),$$

where the rank has to be taken in $\mathcal{D}(m \times n, r)$. In particular, $\Omega$ is finitely completable if and only
if $\operatorname{rk}_r(G) = d_r(G)$.

(ii) Let $\Omega$ be some masking in rank $r$, let $G = G(\Omega)$. Then $\Omega$ is finitely completable if and only if
$G(\Omega)$ contains a subgraph $G'$ which is independent in rank $r$ and has $e(G') = d_r(G)$.

(iii) Let $\Omega$ be some partial masking in rank $r$, with $(G \hookrightarrow H) = G(\Omega)$. Then

$$\dim \Omega = \operatorname{rk}_r(H) - \operatorname{rk}_r(G).$$

In particular, $\Omega$ is finitely completable if and only if $\operatorname{rk}_r(G) = \operatorname{rk}_r(H)$.

Proof. (i) is a reformulation of Lemma 2.5.24, together with the fact that finite completability of
$\Omega$ is equivalent to $\dim \Omega = 0$.

For (ii), note that (i) implies for finitely completable $\Omega$ that $\operatorname{rk}_r(G) = d_r(G)$. By Proposition
2.5.21 (iv) and Proposition 2.5.26, this is equivalent to the fact that there is a subgraph $G'$
of $G$ with $\operatorname{rk}_r(G') = e(G') = d_r(G)$. Note that the latter condition implies that $G'$ is an independent
graph.

(iii) follows from Lemma 2.5.24 by applying it to a partial masking $(N, M)$ and the map
$\Omega_{N/M}$, while using $\Omega_M = \Omega_{N/M} \circ \Omega_N$ and the dimension formula

$$\dim \Omega_M = \dim \Omega_{N/M} + \dim \Omega_N.$$

□
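Corollary 2.5.30 (i) turns the differential rank test into a count of the remaining degrees of freedom. The following sketch is illustrative and rests on assumptions: the names are hypothetical, $d_r(m, n) = r(m + n - r)$ is used as the dimension of the determinantal variety, the mask is assumed to touch every row and column, and the rank-1 point with non-zero coordinates stands in for a generic point.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over Q."""
    mat = [[Fraction(x) for x in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    rk = 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for i in range(rk + 1, len(mat)):
            if mat[i][col] != 0:
                factor = mat[i][col] / mat[rk][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def differential_row(i, j, U, V):
    """Coordinates of da_ij in the basis of the formal differentials dU_ik, dV_jk."""
    m, n, r = len(U), len(V), len(U[0])
    row = [0] * (r * (m + n))
    for k in range(r):
        row[i * r + k] = V[j][k]
        row[m * r + j * r + k] = U[i][k]
    return row


def d_r(m, n, r):
    """Dimension of the variety of (m x n) matrices of rank at most r."""
    return r * (m + n - r)


def fiber_dimension(observed, U, V):
    """dim(Omega) = d_r - rk_r: degrees of freedom left after observing the mask."""
    m, n, r = len(U), len(V), len(U[0])
    observed_rank = rank([differential_row(i, j, U, V) for (i, j) in observed])
    return d_r(m, n, r) - observed_rank


U0 = [[2], [3], [5]]
V0 = [[7], [11], [13]]
diagonal = [(0, 0), (1, 1), (2, 2)]               # three disjoint edges
star = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]   # first row and first column
```

For the diagonal mask two degrees of freedom remain, while the mask consisting of the first row and first column leaves none, i.e., it is finitely completable in rank 1.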



Corollary 2.5.30 is more than a mere reformulation of the previous results on fiber dimension,
since it implicitly references the matroidal structure induced by dependence of the matrix entries.
Without the matroidal property it is difficult$^{21}$ to see that, e.g., adding a single known entry
removes at most one degree of freedom, generically. More importantly, we can use it to prove:

Theorem 2.5.31. If a masking $\Omega$ is generically finite in rank $r$, then $G(\Omega)$ is spanned in rank $r$. In
particular, conditions (i), (ii), (iii) from Proposition 2.3.33 are all necessary for $\Omega$ to be generically
finite or generically injective.

Proof. The matroidal property of generic completability implies that it is no loss of generality to
reduce to the case in which $\Omega$ is minimally finitely completable, i.e., when it is finitely completable
but ceases to be so when any of its known entries is removed. Said differently, this is when $\Omega$ is
a basis of the completion matroid.

Corollary 2.5.30 (i) implies that for any basis $\Omega$ in the completion matroid, $e(G(\Omega)) =
d_r(m, n)$. A second application of Corollary 2.5.30 (i) implies that any masking $\Omega'$ with $e(G(\Omega')) \ge
d_r(G(\Omega')) + 1$ contains a circuit in the completion matroid. Since bases are all independent,
it follows that every subgraph $G'$ of $G(\Omega)$ has $e(G') \le d_r(G')$. This shows that $G(\Omega)$ is $r$-sparse
with $d_r(m, n)$ edges; i.e., it is $r$-tight. □

$^{21}$ Though not impossible, since it can be seen as a consequence of Krull's height theorem.
Given that generically finitely completable masks must be spanned in rank $r$, we might wonder
if this is a sufficient condition. In fact, it is not. The intuition is that it is possible to glue
$r$-tight graphs along $r$ vertices to get another $r$-tight graph, but that finitely completable masks
glued along less than an $r \times r$ block will necessarily have left-over degrees of freedom.

Theorem 2.5.32. Let r > 1. There exist rank-r-tight masks that are not finitely completable. 

Proof. Let $G_1$ and $G_2$ be bipartite graphs, with, respectively, $m_i$ and $n_i$ red and blue vertices,
associated with minimally finitely completable masks. Assume also that $m_i$ and $n_i$ are sufficiently
large. By Theorem 2.5.31, the graphs $G_i$ are $r$-tight.

That the average degree in each $G_i$ is less than $2r$ implies that each of $G_1$ and $G_2$ has an
independent set of size $r$ that is not all red or all blue vertices; select one such independent set
$I_1$ and $I_2$ in each of $G_1$ and $G_2$. Define $H$ to be the graph formed by identifying $I_1$ and $I_2$.

The graph $H$ has $m_1 + n_1 + m_2 + n_2 - r$ vertices and $r(m_1 + n_1 + m_2 + n_2 - 2r)$ edges. Since
the $G_i$ are rank $r$-sparse and edge disjoint, $H$ is as well. Thus $H$ is $r$-tight.

Now form $G_1'$ and $G_2'$ by adding a new edge $e_i$ between a red and a blue vertex in each $I_i$; this
is possible by construction. Define $H'$ to be the graph obtained by identifying the $I_i$ in such a way
that the $e_i$ are identified as well, to a single edge $e$. Because the $G_i$ were finitely completable, there are two circuits
in the completion matroid of $H'$ going through $e$. Eliminating $e$ shows that there is a completion
circuit in $H' \setminus e = H$. Edge counts now show that a basis of $H$ in the completion matroid has strictly
fewer than $d_r(H)$ edges, and Theorem 2.5.31 implies that $H$ cannot be finitely completable. □



In particular, the proof shows that there are circuits in the completion matroid that are rank $r$-sparse.
We now develop some further properties of circuits in the completion matroid. Theorem
2.5.31 and the matroidal property imply that while there are circuits with fewer than $d_r(G) + 1$
edges, they cannot have more. Combined with a degree lower bound for circuits, we can show
that the number of red and blue vertices cannot be too unbalanced in a circuit, which implies a
bound on the number of circuit graphs with $m$ red vertices.

Proposition 2.5.33. A circuit graph in rank r has vertex degrees at least r + 1. In particular, a 
circuit graph with m and n red and blue vertices always has m> r and n> r. 



Proof. Let $G = (V, E)$ be a circuit graph. Theorem 2.5.14 and Remark 2.5.17 imply that the rank
resp. dimension of the $\mathbb{C}$-vector space generated by the differentials

$$da_{ij} = du_i \cdot v_j^\top + u_i \cdot dv_j^\top, \quad (ij) \in E,$$

must be $\#E - 1$, where $u_i$, $1 \le i \le m$ and $v_j$, $1 \le j \le n$ are generic vectors in $\mathbb{C}^r$, and the
components of the $r$-vectors $du_i$ resp. $dv_j$ are formal basis elements. Equivalently reformulated,
this means that there are $\lambda_{ij} \in \mathbb{C}$, $(ij) \in E$, not all zero, such that

$$\sum_{(ij) \in E} \lambda_{ij} \, da_{ij} = 0,$$

and that none of the $\lambda_{ij}$ can be chosen zero if at least one is non-zero. Using the above representation
in the basis given by $du_i$ and $dv_j$, the condition becomes

$$0 = \sum_{(ij) \in E} \lambda_{ij} \left( du_i \cdot v_j^\top + u_i \cdot dv_j^\top \right)
= \sum_{i=1}^{m} du_i \cdot \Big( \sum_{j : (ij) \in E} \lambda_{ij} v_j^\top \Big) + \sum_{j=1}^{n} \Big( \sum_{i : (ij) \in E} \lambda_{ij} u_i \Big) \cdot dv_j^\top.$$

Since the components of $du_i$ and $dv_j$ form a basis of the vector space of differentials, this implies
that there are non-zero $\lambda_{ij}$ such that

$$0 = \sum_{j : (ij) \in E} \lambda_{ij} v_j \ \text{ for any (arbitrary but fixed) } i, \quad \text{and} \quad 0 = \sum_{i : (ij) \in E} \lambda_{ij} u_i \ \text{ for any (arbitrary but fixed) } j.$$

Since the $u_i, v_j$ are generic, and $u_i, v_j \in \mathbb{C}^r$, this can hold only if $N_i \ge r + 1$ and $N_j \ge r + 1$, where

$$N_i = \#\{ j ; (ij) \in E \} \quad \text{and} \quad N_j = \#\{ i ; (ij) \in E \}$$

(note that the definitions of $N_i$ and $N_j$ imply that $i$ resp. $j$ are arbitrary but fixed). Since $N_i$ and
$N_j$ are the vertex degrees of the vertex $i$ resp. the vertex $j$, this implies that each vertex in $G$ has
degree at least $r + 1$, which was the statement to prove. □

Proposition 2.5.34. The number of blue vertices in a circuit graph in rank r with m red vertices is 
at most r(m — r) + 1. 

Proof. Let $G$ be a circuit graph in rank $r$ with $m$ red vertices and $n$ blue vertices. As noted above,
Theorem 2.5.31 implies that $e(G) \le d_r(G) + 1$, where $d_r(G) = r(m + n - r)$. Estimating the number of
edges in $G$ from below using the degree lower bound of Proposition 2.5.33, we get

$$n(r + 1) \le e(G) \le r(m + n - r) + 1,$$

and solving for $n$ gives $n \le r(m - r) + 1$. □



Corollary 2.5.35. The number of circuits in rank $r$ with $m$ red vertices is at most $2^{mr(m-r)+m}$.
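The two bounds are simple enough to tabulate. The sketch below is illustrative, with hypothetical function names; it assumes the counting bound is obtained by counting all bipartite graphs on $m$ red vertices and the maximal admissible number $r(m - r) + 1$ of blue vertices.

```python
def max_blue_vertices(m, r):
    """Upper bound r(m - r) + 1 on the number of blue vertices of a rank-r circuit graph."""
    return r * (m - r) + 1


def circuit_count_bound(m, r):
    """Upper bound 2^(m r (m - r) + m): each of the at most r(m - r) + 1 blue
    vertices is joined to an arbitrary subset of the m red vertices."""
    return 2 ** (m * max_blue_vertices(m, r))


# Rank 1 with two red vertices: a circuit (a 4-cycle) has at most two blue vertices.
blue_rank_one = max_blue_vertices(2, 1)
# Rank m - 1: at most m blue vertices, matching the complete bipartite graph K_{m,m}.
blue_corank_one = max_blue_vertices(4, 3)
```

The bound on the number of circuits is far from tight; it only records that the number of circuit graphs with $m$ red vertices is finite and explicitly bounded.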

Bernd Sturmfels and Zvi Rosen have told us they obtained, independently, a similar result
with a weaker conclusion. In rank $1$ and in rank $m - 1$ we can give an exact characterization of the
circuit graphs.

Proposition 2.5.36. The following statements are true: 

(i) The circuit graphs in rank r = 1 are exactly the cycles. 

(ii) The unique circuit graph in rank $r = m - 1$ is exactly $K_{m,m}$.



Proof. By Lemma 2.3.24 and Proposition 2.3.25, the completion matroid in rank 1 is isomorphic to the
graphic matroid, which has as its circuits the cycles [31, Proposition 1.1.7]. This proves (i).

For (ii), Proposition 2.3.25 implies that the almost biclique is finitely completable, and, since
it has $d_{m-1}(m, m)$ edges, independent. Thus $K_{m,m}$ is a circuit. By Proposition 2.5.33, in any
other circuit graph $G$, every blue vertex must be connected to all of the red vertices, forcing $G$ to
contain a copy of $K_{m,m}$, contradicting minimality of circuits. □
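Statement (i) can be verified by brute force on a small example. The sketch below is illustrative only: the helper names are hypothetical, and it enumerates the minimal dependent sets of entries of a $(2 \times 3)$ rank-1 matrix using the differential rank test at a fixed point with non-zero coordinates (used here in place of a generic point).

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over Q."""
    mat = [[Fraction(x) for x in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    rk = 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for i in range(rk + 1, len(mat)):
            if mat[i][col] != 0:
                factor = mat[i][col] / mat[rk][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def differential_row(i, j, U, V):
    """Coordinates of da_ij in the basis of the formal differentials dU_ik, dV_jk."""
    m, n, r = len(U), len(V), len(U[0])
    row = [0] * (r * (m + n))
    for k in range(r):
        row[i * r + k] = V[j][k]
        row[m * r + j * r + k] = U[i][k]
    return row


def circuits(U, V):
    """Minimal dependent sets of entries, found by increasing cardinality."""
    m, n = len(U), len(V)
    entries = [(i, j) for i in range(m) for j in range(n)]
    found = []
    for size in range(1, len(entries) + 1):
        for subset in combinations(entries, size):
            candidate = frozenset(subset)
            if any(c <= candidate for c in found):
                continue  # contains a smaller circuit, hence not minimal
            rows = [differential_row(i, j, U, V) for (i, j) in subset]
            if rank(rows) < len(subset):
                found.append(candidate)
    return found


U0 = [[2], [3]]
V0 = [[5], [7], [11]]
rank_one_circuits = set(circuits(U0, V0))
# The cycles of K_{2,3}: one 4-cycle for each pair of blue vertices (columns).
four_cycles = {
    frozenset({(0, a), (0, b), (1, a), (1, b)}) for a, b in [(0, 1), (0, 2), (1, 2)]
}
```

The enumeration is exponential in the number of entries and is only meant as a sanity check; in this example the circuits found are exactly the three 4-cycles of $K_{2,3}$.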

2.6 Completability of random masks 

Up to this point we have considered the generic completability of a fixed mask, which we have
shown to be equivalent to questions about the associated bipartite graph. We now turn to the case
where the masking is sampled at random, which, by Corollary 2.5.30, implies that, generically,
this is a question about random bipartite graphs.

2.6.1 Random graph models

A random graph is a graph valued random variable. We are specifically interested in two such 
models for bipartite random graphs: 

Definition 2.6.1. The Erdős-Rényi random bipartite graph $G(m, n, p(m, n))$ is a graph on $m$ red
vertices and $n$ blue vertices with each edge present with probability $p(m, n)$, independently. When the context is
clear we write $p = p(m, n)$.

Definition 2.6.2. The $d$-regular random bipartite graph $G(m, n, d, d')$ is the uniform distribution
on graphs with $m$ red vertices, $n$ blue ones, and each red vertex with degree $d$ and each blue vertex
with degree $d'$.

Clearly, we need $md = nd'$. Notice that when $m = n$, the
$d$-regular random bipartite graph is, in fact, $d$-regular.
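The Erdős-Rényi model of Definition 2.6.1 is straightforward to sample. The sketch below is illustrative; the function names and the representation of a mask as a set of index pairs are assumptions made for the example.

```python
import random


def erdos_renyi_mask(m, n, p, seed=None):
    """Sample the edge set of G(m, n, p): each of the m*n possible entries is
    observed independently with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i in range(m) for j in range(n) if rng.random() < p}


def minimum_degree(mask, m, n):
    """Smallest number of observed entries in any row (red vertex) or column (blue vertex)."""
    row_degrees = [sum(1 for (i, _) in mask if i == row) for row in range(m)]
    col_degrees = [sum(1 for (_, j) in mask if j == col) for col in range(n)]
    return min(row_degrees + col_degrees)


empty_mask = erdos_renyi_mask(4, 5, 0.0, seed=1)
full_mask = erdos_renyi_mask(4, 5, 1.0, seed=1)
sparse_mask = erdos_renyi_mask(50, 50, 0.1, seed=1)
```

The minimum degree is included because, in the generic setting, a finitely completable mask needs every row and column to be observed sufficiently often.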

We will call a mask corresponding to a random graph a random mask. We now quote some 
standard properties of random graphs we need. 

Proposition 2.6.3. (Connectivity threshold) The threshold for $G(m, n, p)$ to become connected,
w.h.p., is $p = \Theta((m + n)^{-1} \log n)$.

(Minimum degree threshold) The threshold for the minimum degree in $G(n, n, p)$ to reach $d$ is
$p = \Theta((m + n)^{-1} (\log n + d \log\log n + \omega(1)))$. When $p = c/n$, w.h.p., there are isolated vertices.

(Connectivity threshold) With high probability, $G(m, n, d, d')$ is $d$-connected. (Recall that we
assume $m \le n$.)

(Density principle) Suppose that the expected number of edges in either of our random graph
models is at most $Cn$, for a constant $C$. Then for every $\epsilon > 0$, there is a constant $c$, depending
only on $C$ and $\epsilon$, such that, w.h.p., every subgraph of $n'$ vertices spanning at least $(1 + \epsilon) n'$
edges has $n' \ge cn$.

(Emergence of the $k$-core) Define the $k$-core of a graph to be the maximal induced subgraph with
minimum degree $k$. For each $k$, there is a constant $c_k$ such that $p = c_k / n$ is the first-order threshold
for the $k$-core to emerge. When the $k$-core emerges, it is giant and afterwards its size and
number of edges spanned grow smoothly with $p$.

2.6.2 Completability of incoherent matrices 

The fundamental result in the area of matrix completion, proven independently in the papers 
HE! is 

Theorem 2.6.4. Let $A$ be an incoherent rank $r$ matrix, with $r = O(1)$. Then, with high probability,
an Erdős-Rényi mask with $p = \Theta(r n^{-1} \log n)$ is sufficient to complete $A$ uniquely.

We note that the conclusion is not that the mask is generically uniquely completable, since the
(crucial) incoherence assumption is about the underlying matrix $A$. In the next section, we will
give a generic version of Theorem 2.6.4.

In a sense, Theorem 2.6.4 is the best possible. There are incoherent matrices with a block diagonal
structure such that no sparser sampling can guarantee even finite completability with high
probability [foil . 
In the generic case, Theorem 2.5.31 implies that a finitely completable mask requires minimum
degree $r$. The minimum degree threshold in the Erdős-Rényi model gives a similar lower
bound. Combined with the methods of Section 2.7 below, we may conclude:

Proposition 2.6.5. Let $r$ be a fixed constant. There are constants $c$ and $C$ such that, if $p =
c(n + m)^{-1} \log n$ then, w.h.p., $G(m, n, p)$ is not finitely completable, and if $p = C(n + m)^{-1} \log n$
then, w.h.p., $G(m, n, p)$ is finitely completable.

2.6.3 Sparser sampling and partial completability 

The lower bounds on sample size for completion of rank r incoherent matrices do not carry over 
verbatim to the generic setting of this paper. This is because genericity and incoherence are 
related, but incomparable concepts: there are generic matrices that are not incoherent (consider
a very small perturbation of the identity matrix); and, importantly, the block diagonal examples 
showing the lower bound for incoherent completability are not generic, since many of the entries 
are zero. 

Thus, in the generic setting, we expect sparse sampling to be more powerful. This is demonstrated
experimentally in Section 4.2. In the rest of this section, we derive some heuristics for the
expected generic completability behavior of sparse random masks. We are particularly interested
in the question: when are $\Omega(mn)$ of the entries completable from a sparse random mask? We
call this the completability transition. We will conjecture that there is a sharp threshold for the
completability transition, and that the threshold occurs well below the threshold for $G(n, m, p)$
to be completable.

Let c be a constant. We first consider the emergence of a circuit in $G(n,n,c/n)$. Proposition
2.5.33 implies that any circuit is a subgraph of the (r+1)-core. By Theorem 2.5.31, having a
circuit is a monotone property, which occurs with probability one for graphs with more than 2rn
edges, and thus the value

$$t_r := \sup\{t : G(n,n,t/n) \text{ is } r\text{-independent, w.h.p.}\}$$

is a constant. If we define $C_r$ as

$$C_r := \sup\{c : \text{the } (r+1)\text{-core of } G(n,n,c/n) \text{ has average degree at most } 2r, \text{ w.h.p.}\},$$

smoothness of the growth of the (r+1)-core implies that we have

$$c_{r+1} \le t_r \le C_r,$$

where we recall that $c_{r+1}$ is the threshold degree for the (r+1)-core to emerge. Putting things
together we get:

Proposition 2.6.6. There is a constant $t_r$ such that, if $c < t_r$, then, w.h.p., $G(n,n,c/n)$ is
r-independent, and, if $c > t_r$, then, w.h.p., $G(n,n,c/n)$ contains a giant r-circuit inside the (r+1)-core.
Moreover, $t_r$ is at most the threshold for the (r+1)-core to reach average degree 2r.



Proposition 2.6.6 gives us some structural information about where to look for rank r circuits
in $G(n,n,c/n)$: they emerge suddenly inside of the (r+1)-core and are all giant when they do.
If rank r circuits were themselves finitely completable, this would then yield a threshold for the
completability transition. Unfortunately, Theorem 2.5.32 tells us that this is not, in general, the
case. Nonetheless, we conjecture:
Conjecture 2.6.7. The constant $t_r$ is the threshold for the completability transition in $G(n,n,c/n)$.
Moreover, we conjecture that almost all of the (r+1)-core is completable above the threshold.



We want to stress that the conjecture includes a conjecture about the existence of the threshold
for the completability transition, which has not been established here, unlike the existence of the
threshold for the emergence of a circuit. The subtlety is that we have not ruled out examples of
r-independent graphs with no r-spanning subgraph for which, nonetheless, the r-closure is giant.
Conjecture 2.6.7 is explored experimentally in Sections 4.1 and 4.2.



Our second conjecture is about 2r-regular masks. 

Conjecture 2.6.8. With high probability, $G(n,n,2r,2r)$ is finitely completable. Moreover, we
conjecture that it remains so, w.h.p., after removing $r^2$ edges uniformly at random.

We provide evidence in Section 4.2. This behavior is strikingly different from the incoherent
case.

2.6.4 Denser sampling and solving minor by minor 

The conjectures above, even if true, provide only information about matrix completability and
not matrix completion. In fact, the convex relaxation of [3] does not seem to do very well on
2r-regular masks in our experiments, and the density principle for sparse random graphs implies
that, w.h.p., a 2r-regular mask has no dense enough subgraphs for our closability algorithm in
section 3.2 to even get started. Thus it seems possible that these instances are quite "hard" to
complete even if they are known to be completable.

If we consider denser random masks, then the closability algorithm becomes more practical.
A particularly favorable case for it is when every missing entry is part of some $K^-_{r+1,r+1}$. In this
case, the error propagation will be minimal and, heuristically, finding a $K^-_{r+1,r+1}$ is not too hard,
even though the problem is NP-complete in general.

Define the 1-step r-closure of a bipartite graph G as the graph $G_1$ obtained by adding the
missing edge to each $K^-_{r+1,r+1}$ in G. If the 1-step closure of G is $K_{n,n}$, we define G to be 1-step
r-closable. We can give an upper bound on the threshold for 1-step r-closability.

Theorem 2.6.9. There is a constant $C > 0$ such that, if $p = Cn^{-2/(r+2)}\log n$, then, w.h.p., $G(n,n,p)$
is 1-step r-closable.

Proof. Fix r and set p as in the statement. The probability of a specific copy of $K^-_{r+1,r+1}$ appearing
is $p^{(r+1)^2-1}$ and there are $\Theta(n^{2r+2})$ potential copies. Since $K^-_{r+1,r+1}$ is its own least probable
subgraph, we see that if $p = Cn^{-2/(r+2)}$ the expected number X of edge disjoint copies of $K^-_{r+1,r+1}$
in $G(n,n,p)$ is at least $C' n^2\log n$ for some absolute constant $C'$ depending on C.

A fundamental result about the number of copies of small subgraphs [19, Theorem 3.29] implies
that X is sharply concentrated around its expectation, so, w.h.p.,
$C''^{-1} n^2\log n \le X \le C'' n^2\log n$ for a constant $C''$ depending only on r and C.

We now define $\mathcal{E}$ to be the event that $G(n,n,p)$ is 1-step r-closable. Also define
$\mathcal{B}$ to be the event that, in $G(n,n,p)$, no pair of vertices is the "missing" edge in more than
$D\log n$ copies of $K^-_{r+1,r+1}$, for some sufficiently large constant D. Since $\mathcal{E}$ and $\neg\mathcal{B}$ are both
increasing events, the FKG inequality (e.g., [19, Theorem 2.12]) implies that
$\Pr\{\mathcal{E}\} \le \Pr\{\mathcal{E} \mid \neg\mathcal{B}\}$.
Using this estimate we get

$$\Pr\{\mathcal{E}\} = \Pr\{\mathcal{E}\mid\mathcal{B}\}\Pr\{\mathcal{B}\} + \Pr\{\mathcal{E}\mid\neg\mathcal{B}\}\Pr\{\neg\mathcal{B}\}
\ge \Pr\{\mathcal{E}\mid\mathcal{B}\}\Pr\{\mathcal{B}\} + \Pr\{\mathcal{E}\}\Pr\{\neg\mathcal{B}\}.$$

Rearranging, we conclude that $\Pr\{\mathcal{E}\} \ge \Pr\{\mathcal{E}\mid\mathcal{B}\}$.

Conditioning on $\mathcal{E}$ and the fact that there are at least $C''^{-1} n^2\log n$ edge disjoint copies of
$K^-_{r+1,r+1}$, we consider the process that reveals each of these copies one at a time. Since, for each
pair of vertices (i, j), the probability that the next revealed copy has (i, j) as its missing edge is
$\Omega(1/n^2)$, and $C''$ is arbitrary, we can increase the probability that any fixed pair is covered by at
least one copy to $1 - 1/n^3$ and then apply a union bound. □

An interesting question is determining the threshold for $G(n,n,p)$ to be r-closable. Experimentally,
it appears that $p = n^{-2/(r+2)}$ is, in fact, the order of the true threshold, and that when $G(n,n,p)$
is closable it is O(1)-step closable.



2.7 Sufficient Sampling Densities for Algebraic Compressed Sensing 

For different conditioned sampling methods of both matrices and masks, the asymptotic behavior
of completion has been well analyzed, most notably in the case of uniform sampling of masks
[3], [6], [15]. While much is implied between the lines, none of the available literature addresses
the question of conditioning only on the mask while removing the conditioning on the matrices
directly.

In fact, it turns out that, in a novel and more general algebraic framework, the analytic arguments
found in the previous work can be modified to provide identifiability results in the setting where a
point on an algebraic variety is to be reconstructed from a set of random projections. In particular,
with Theorem 2.7.8 we will obtain a result which relates the necessary number of observations
directly to intrinsic properties of the variety (namely, its incoherence, which we will define),
notably without further conditioning on how the point of the variety was sampled. We believe that
this result is the canonical expression of a general principle in compressed sensing that relates
the necessary sampling density to properties of the signal space without further assumptions.



2.7.1 Finiteness of Random Maskings 

In the following, we will examine compressed sensing under algebraic constraints. That is, we are
given a signal $x \in \mathbb{C}^n$, where the inclusion into $\mathbb{C}^n$ is to be considered as a parametric
representation of the signal, and an algebraic variety $X \subseteq \mathbb{C}^n$ which describes the compression
constraints, such that $x \in X$; we attempt to reconstruct the signal x from random coordinate
projections of x, under consideration of the compression constraints X. The main result of this
section will characterize the sampling density, i.e., the number of random coordinate projections
of x needed to reconstruct a generic x, in terms of X, without further sampling assumptions on x.

As a corollary, we will obtain upper reconstruction bounds for Matrix Completion, where x is
a low-rank matrix, and X is the variety of low-rank matrices M(m × n, r).

First we introduce some formal concepts which describe the setting of compressed sensing
under algebraic constraints, in particular the sampling process, which we will assume to sample
coordinate projections of the signal randomly, independently and uniformly, without repetition.
Definition 2.7.1. Let $X \subseteq \mathbb{C}^n$ be an algebraic variety. Fix coordinates $(X_1,\dots,X_n)$ for $\mathbb{C}^n$. Let
S(p) be the Bernoulli random experiment yielding a random subset of $\{X_1,\dots,X_n\}$, where each $X_i$
is contained in S(p) independently with probability p. We will call the projection map

$$\Omega : X \to Y, \qquad (x_1,\dots,x_n) \mapsto (\dots, x_i, \dots : X_i \in S(p))$$

of X onto the coordinates in S(p), which is an algebraic-map-valued random variable, an algebraic
random masking of X with selection probability p.

Intuitively, $\Omega$ takes a signal x from the signal space $\mathbb{C}^n$, fulfilling the constraints in X, and
independently samples a Bernoulli set of random coordinate projections with sampling
density p.
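To make the sampling model concrete, here is a minimal Python sketch of one draw of such a random masking. The function name `random_masking` and the representation of the signal as a plain list are illustrative assumptions, not notation from the paper; the constraint set X plays no role in the sampling step itself.

```python
import random

def random_masking(x, p, rng=None):
    """Draw one Bernoulli(p) coordinate set S and return (S, x restricted to S).

    Hypothetical helper: each coordinate index is kept independently
    with probability p, mirroring the experiment S(p) of Definition 2.7.1.
    """
    rng = rng or random.Random()
    S = [i for i in range(len(x)) if rng.random() < p]
    return S, [x[i] for i in S]

# Example: observe roughly 30% of the coordinates of a length-10 signal.
S, y = random_masking([float(i) for i in range(10)], 0.3, random.Random(0))
```

The output pair corresponds to the observed positions and the observed values; only the positions matter for the finiteness questions studied in this section.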

The constraints in X will play a crucial role in determining the necessary sampling density 
which allows reconstruction of the signal. Namely, the central property of X which will determine 
the necessary density is the so-called coherence, which describes the degree of randomness of a 
generic tangent plane to X; intuitively, it can be interpreted as the infinitesimal randomness of a 
signal. 

Definition 2.7.2. Let H be a k-flat$^{22}$ in $\mathbb{C}^n$. Let $\mathcal{P} : \mathbb{C}^n \to H \subseteq \mathbb{C}^n$ be the orthogonal projection
operator onto H, and let $e_1,\dots,e_n$ be a fixed orthonormal basis of $\mathbb{C}^n$. Then the coherence of H with respect
to the basis $e_1,\dots,e_n$ is defined as

$$\operatorname{coh}(H) = \max_{1\le i\le n} \|\mathcal{P}(e_i) - \mathcal{P}(0)\|^2.$$

The coherence of a k-flat is bounded in both directions:

Proposition 2.7.3. Let H be a k-flat in $\mathbb{C}^n$. Then,

$$\frac{k}{n} \le \operatorname{coh}(H) \le 1,$$

and both bounds are tight.

Proof. Without loss of generality, we can assume that $0 \in H$ and therefore that $\mathcal{P}$ is linear, since
coherence, as defined in Definition 2.7.2, is invariant under translation of H.

First we show the upper bound. For that, note that for an orthogonal projection operator
$\mathcal{P} : \mathbb{C}^n \to \mathbb{C}^n$ and any $x \in \mathbb{C}^n$, one has $\|\mathcal{P}(x)\| \le \|x\|$. Thus, by definition,

$$\operatorname{coh}(H) = \max_{1\le i\le n} \|\mathcal{P}(e_i)\|^2 \le \max_{1\le i\le n} \|e_i\|^2 = 1.$$

For tightness, take H as the span of $e_1,\dots,e_k$.

Let us now show the lower bound. We proceed by contradiction. Assume $\|\mathcal{P}(e_i)\|^2 < \frac{k}{n}$ for all
i. This would imply

$$k = n\cdot\frac{k}{n} > \sum_{i=1}^{n} \|\mathcal{P}(e_i)\|^2 = \|\mathcal{P}\|_F^2 = k,$$

which is a contradiction, where in the last equality we used the fact that orthogonal projections
onto a k-dimensional space have squared Frobenius norm k.

The tightness of the lower bound was asserted in [3], shortly after Definition 1.2. □ 
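Both extremes of Proposition 2.7.3 can be checked numerically on small examples. The following is a minimal sketch, assuming real coordinates and a linear subspace given by a spanning set (so the translation term vanishes); the helper names `orthonormalize` and `coherence` are hypothetical. It uses the identity $\|\mathcal{P}(e_i)\|^2 = \sum_j |b_j[i]|^2$ for an orthonormal basis $b_1,\dots,b_k$ of H.

```python
import math

def orthonormalize(vectors):
    """Gram-Schmidt on a list of real vectors; returns an orthonormal basis of their span."""
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for b in basis:
            c = sum(x * y for x, y in zip(w, b))
            w = [x - c * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > 1e-12:  # skip vectors already in the span
            basis.append([x / norm for x in w])
    return basis

def coherence(vectors):
    """coh(H) = max_i ||P(e_i)||^2 for H = span(vectors), standard basis e_i."""
    basis = orthonormalize(vectors)
    n = len(vectors[0])
    return max(sum(b[i] ** 2 for b in basis) for i in range(n))

# Upper bound attained: H = span(e_1, e_2) in dimension 4.
aligned = coherence([[1, 0, 0, 0], [0, 1, 0, 0]])
# Lower bound k/n = 2/4 attained: H spanned by two rows of a Hadamard matrix.
spread = coherence([[1, 1, 1, 1], [1, -1, 1, -1]])
```

In the first case the subspace is aligned with the coordinate axes; in the second, its mass is spread evenly over all coordinates, which is the maximally incoherent situation.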



22 A k-flat is a linear subspace of dimension k which does not necessarily contain 0. Other names are affine subspace
or affine linear variety.
A definition of coherence similar to Definition 2.7.2 was used by Candès and Recht [3];
we decided to remove the dimensional normalization in order to make the definition more intrinsic,
i.e., not to depend on the dimension of the embedding. For completeness, we also state the
original concept:

Definition 2.7.4. Let H be a k-flat in $\mathbb{C}^n$. The normalized coherence, or coherence in the sense of
Candès and Recht [3], is the quantity $\frac{n}{k}\operatorname{coh}(H)$.

In our definition, one always has that $\operatorname{coh}(H) \le 1$, possibly attaining the upper bound, while
the normalized coherence has 1 as a possibly attainable lower bound. A normalized version of
Proposition 2.7.3 was implicitly stated, but not proved, in [3]. While it should be straightforward,
we decided to state it nevertheless, since it will play an important role in proving Proposition
2.7.10, which allows us to apply the main results of this section to Matrix Completion.



Definition 2.7.5. Let X be a complex algebraic variety of dimension d (affine or projective). Let $X^*$
be the dual variety$^{23}$ of X. The coherence as given in Definition 2.7.2 defines a continuous function

$$\operatorname{coh} : X^* \to [0,1], \qquad [H] \mapsto \operatorname{coh}(H).$$

We define the infimum of coh on $X^*$ to be the coherence of X, and denote it by coh(X).

Note that if X is a k-flat, then the definitions of coh(X) given by Definitions 2.7.2 and 2.7.5
agree. Also, if X is projective, then $X^*$ is compact, so the infimum is in fact a minimum. These
observations, together with Proposition 2.7.3, imply



Proposition 2.7.6. Let X be a complex algebraic variety in $\mathbb{C}^n$. Then,

$$\frac{1}{n}\dim X \le \operatorname{coh}(X) \le 1,$$

and both bounds are tight.

Definition 2.7.7. A complex algebraic variety X is called maximally incoherent if

$$\operatorname{coh}(X) = \frac{1}{n}\dim X.$$

The following theorem relates the coherence of a variety X to the sampling density of a
generic (constrained) signal $x \in X$, which is needed to achieve reconstruction of x, up to finite
choice. The proof integrates some ideas of Candès and Recht [3] into our general algebraic
setting. Also, the proof uses two lemmata, namely Lemmata 2.7.12 and 2.7.13, which can be
found at the end of the section.

Theorem 2.7.8. Let $X \subseteq \mathbb{C}^n$ be an irreducible algebraic variety, let $\Omega$ be an algebraic random
masking with selection probability p, and let $x \in X$ be a smooth point. There is an absolute constant C
such that if

$$p \ge C\cdot\lambda\cdot\operatorname{coh}(X)\cdot\log n, \quad\text{with } \lambda \ge 1,$$

then $\Omega$ is generically finite with probability at least

$$1 - 3n^{-\lambda}.$$

23 For a variety X of dimension d, the dual variety is the set of tangent d-flats of X, which is known to be an algebraic
variety. More exactly, the tangent d-flats at non-singular points of X form a relatively Zariski open set in some (affine
or projective) Grassmannian; its closure is an algebraic variety.






Proof. Without loss of generality we can assume that X is projective and thus compact (e.g.,
by using Chow's lemma). Thus, there exists $x \in X$ such that for the tangent space $T_xX$ at x it
holds that $\operatorname{coh}(T_xX) = \operatorname{coh}(X)$. Now let $y = \Omega(x)$; note that y is a point-valued discrete random
variable. By the equivalence of the statements (iv) and (v) in Lemma 2.7.12, it suffices to show
that the operator

$$Z = \|p^{-1}\,\theta\circ d\Omega - \mathrm{id}\|$$

is contractive, where $\theta$ is the projection from $T_y$ onto $T_x$, with probability at least $1 - 3n^{-\lambda}$ under
the assumptions on p. Let $e_1,\dots,e_n$ be the orthonormal coordinate system we choose for $\mathbb{C}^n$, and
$\mathcal{P}$ the projection onto $T_x$. Then the projection $\theta\circ d\Omega$ has, when we consider $T_x$ to be embedded
into $\mathbb{C}^n$, the matrix representation



where $\varepsilon_i$ are independent Bernoulli random variables with probability p for 1 and (1 - p) for 0.
Thus, in matrix representation,



By Rudelson's Lemma 2.7.13 it follows that

$$E(Z) \le C\sqrt{\frac{\log n}{p}}\,\max_{i}\|\mathcal{P}(e_i)\|$$

for an absolute constant C, provided the right hand side is smaller than 1. The latter is true if and
only if

$$p > C^{2}\log n\,\max_{i}\|\mathcal{P}(e_i)\|^2.$$

Now let $\delta > 0$, and let U be an open neighborhood of x such that $\operatorname{coh}(T_yX) < (1+\delta)\operatorname{coh}(X)$
for all $y \in U$. Then, one can write


Z = sup 



with a countable subset $U' \subseteq U$. By construction of $U'$, one has



(yi, Hed) (y 2 , <P~\i + 5) coh(x). 

Applying Talagrand's Theorem 9.1 from [3], one obtains

$$P\big(|Z - E(Z)| > t\big) \le 3\exp\left(-\frac{t}{KB}\log\left(1+\frac{t}{2}\right)\right)$$

with an absolute constant K and $B = p^{-1}(1+\delta)\operatorname{coh}(X)$. Since $\delta$ was arbitrary, it follows that

$$P\big(|Z - E(Z)| > t\big) \le 3\exp\left(-\frac{t\,p}{K\operatorname{coh}(X)}\log\left(1+\frac{t}{2}\right)\right).$$

Substituting $p = C\cdot\lambda\cdot\operatorname{coh}(X)\cdot\log n$, and proceeding as in the proof of Theorem 4.2 in [3] (while
changing absolute constants), one arrives at the statement. □
Corollary 2.7.9. Keep the notations of Theorem 2.7.8. If X is moreover maximally incoherent, and

$$pn \ge C\cdot\lambda\cdot\dim(X)\cdot\log n, \quad\text{with } \lambda \ge 1,$$

then $\Omega$ is generically finite with probability at least

$$1 - 3n^{-\lambda}.$$
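The corollary follows from Theorem 2.7.8 by a one-line substitution, spelled out here as a sketch. It assumes that the threshold in Theorem 2.7.8 is read with a non-strict inequality and that n denotes the ambient dimension.

```latex
\begin{align*}
\operatorname{coh}(X) = \frac{\dim X}{n}
\quad\Longrightarrow\quad
\Big( p \ge C\,\lambda\,\operatorname{coh}(X)\,\log n
\iff
pn \ge C\,\lambda\,\dim(X)\,\log n \Big).
\end{align*}
```

Since pn is the expected number of observed coordinates, the condition says that, for a maximally incoherent variety, on the order of $\dim(X)\log n$ observations suffice for generic finiteness.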

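Proposition 2.7.10. M(m × n, r) is maximally incoherent.

Proof. Let $A \in \mathbb{C}^{m\times n}$ be any matrix of rank r or less, with $A = UV^\top$ and $U \in \mathbb{C}^{m\times r}$, $V \in \mathbb{C}^{n\times r}$. Let
H be the tangent space to M(m × n, r) at A, and $H_U$ resp. $H_V$ the column spans of U resp. V. The
calculation leading to [3, Equation 4.9] shows that

$$\operatorname{coh}(H) = \operatorname{coh}(H_U) + \operatorname{coh}(H_V) - \operatorname{coh}(H_U)\operatorname{coh}(H_V).$$

Now for any pair of r-flats $H_U$ and $H_V$ in m- resp. n-space, there exists an A as above; on the other
hand, Proposition 2.7.3 shows that there exist $H_U, H_V$ such that

$$\operatorname{coh}(H_U) = \frac{r}{m} \quad\text{and}\quad \operatorname{coh}(H_V) = \frac{r}{n}.$$

Thus, substituting, this implies that there exists H with

$$\operatorname{coh}(H) = \operatorname{coh}(H_U) + \operatorname{coh}(H_V) - \operatorname{coh}(H_U)\operatorname{coh}(H_V) = \frac{r\cdot(m+n-r)}{mn}.$$

Since $\operatorname{coh}(M(m\times n, r)) \le \operatorname{coh}(H)$ for any such H, this implies, together with the lower bound

$$\operatorname{coh}(M(m\times n, r)) \ge \frac{r\cdot(m+n-r)}{mn}$$

from Proposition 2.7.6, that

$$\operatorname{coh}(M(m\times n, r)) = \frac{r\cdot(m+n-r)}{mn}.$$

□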



Corollary 2.7.11. Let M be an Erdős-Rényi random mask of size (m × n) and sampling probability
p. There is an absolute constant C such that if

$$p \ge C\cdot\lambda\cdot\frac{r\cdot(m+n-r)}{mn}\cdot(\log m + \log n), \quad\text{with } \lambda \ge 1,$$

then $\Omega$ is generically finite with probability at least

$$1 - 3(mn)^{-\lambda}.$$

Finally, we state the lemmata which were used in the proof of Theorem 2.7.8. The first
lemma relates local injectivity to generic finiteness and contractivity of a linear map. It is related
to Corollary 4.3 in [3].
Lemma 2.7.12. Let $\varphi : X \to Y$ be a surjective map of complex algebraic varieties, let $x \in X$, and
$y = \varphi(x)$ be smooth points of X resp. Y. Let

$$d\varphi : T_xX \to T_yY$$

be the induced map of tangent spaces$^{24}$. Then, the following are equivalent:

(i) There is a complex open neighborhood $U \ni x$ such that the restriction $\varphi : U \to \varphi(U)$ is
bijective.

(ii) $d\varphi$ is bijective.

(iii) There exists an invertible linear map $\theta : T_yY \to T_xX$.

(iv) There exists a linear map $\theta : T_yY \to T_xX$ such that the linear map

$$\theta\circ d\varphi - \mathrm{id},$$

where id is the identity operator, is contractive$^{25}$.

If moreover X is irreducible, then the following is also equivalent:

(v) $\varphi$ is generically finite.

Proof. (ii) is equivalent to the fact that the matrix representing $d\varphi$ is an invertible matrix. Thus,
by the properties of the matrix inverse, (ii) is equivalent to (iii), and (ii) is equivalent to (i) by
the constant rank theorem (e.g., 9.6 in [34]).

By the upper semicontinuity theorem (1.8, Corollary 3 in [26]), (i) is equivalent to (v) in the
special case that X is irreducible; the reasoning is completely analogous to the proof of Theorem
2.3.3.

(ii) ⇒ (iv): Since $d\varphi$ is bijective, there exists a linear inverse $\theta : T_yY \to T_xX$ such that
$\theta\circ d\varphi = \mathrm{id}$. Thus

$$\theta\circ d\varphi - \mathrm{id} = 0,$$

which is by definition a contractive linear map.

(iv) ⇒ (iii): We proceed by contradiction. Assume that no linear map $\theta : T_yY \to T_xX$ is
invertible. Since $\varphi$ is surjective, $d\varphi$ also is, which implies that for each $\theta$, the linear map $\theta\circ d\varphi$
is rank deficient. Thus, for every $\theta$, there exists a non-zero $\alpha \in \operatorname{Ker}\theta$. By linearity and surjectivity
of $d\varphi$, there exists a non-zero $\beta \in T_xX$ with $d\varphi(\beta) = \alpha$. Without loss of generality we can assume
that $\|\beta\| = 1$, else we multiply $\alpha$ and $\beta$ by the same constant factor. By construction,

$$\|[\theta\circ d\varphi - \mathrm{id}](\beta)\| = \|\theta(\alpha) - \beta\| = \|\beta\| = 1,$$

so $\theta\circ d\varphi - \mathrm{id}$ cannot be contractive. Since $\theta$ was arbitrary, this proves that (iv) cannot hold if (iii) does
not hold, which is equivalent to the claim. □

24 $T_xX$ is the tangent plane of X at x, which is identified with a vector space of formal differentials where x is
interpreted as 0. Similarly, $T_yY$ is identified with the formal differentials around y. The linear map $d\varphi$ is induced
by considering $\varphi(x + dv) = y + dv'$ and setting $d\varphi(dv) = dv'$; one checks that this is a linear map since x, y are
smooth. Furthermore, $T_xX$ and $T_yY$ can be endowed with the Euclidean norm and scalar product they inherit from the
tangent planes. Thus, $d\varphi$ is also a linear map of normed vector spaces which is always bounded and continuous, but
not necessarily proper.

25 A linear operator A is contractive if $\|A(x)\| < 1$ for all x with $\|x\| \le 1$.

The second lemma is a consequence of Rudelson's Lemma [33] for Bernoulli samples.

Lemma 2.7.13. Let $y_1,\dots,y_M$ be vectors in $\mathbb{R}^n$, let $\varepsilon_1,\dots,\varepsilon_M$ be i.i.d. Bernoulli variables, taking
value 1 with probability p and 0 with probability (1 - p). Then,



$$E\left\| p^{-1}\sum_{i=1}^{M} (\varepsilon_i - p)\, y_i\otimes y_i \right\| \le C\sqrt{\frac{\log n}{p}}\,\max_{1\le i\le M}\|y_i\|$$

with an absolute constant C, provided the right hand side is 1 or smaller.

Proof. The statement is exactly Theorem 3.1 in O, up to a renaming of variables, the proof can 
also be found there. It can also be directly obtained from Rudelson's original formulation in [33]
by substituting $\frac{1}{\sqrt{p}}\,y_i$ in the above formulation for $y_i$ in Rudelson's formulation and upper
bounding the right hand side in Rudelson's estimate. □

2.7.2 Finiteness of Random Projections 



The results of section 2.7.1, in particular Theorem 2.7.8, might lead to the belief that the log-factor
in the number of samples is always, or almost always, necessary for identifiability in terms of the
chosen projections. That, however, is not true. While Theorem 2.7.8 gives a bound which is valid
for any coordinate system and the coherence definition associated to it, the following theorem
states that for a general system of coordinates, a much lower bound and a stricter statement is
true:

Theorem 2.7.14. Let $X \subseteq \mathbb{C}^n$ be an irreducible algebraic variety, let $\Omega$ be the projection onto a
generic k-flat. Let $x \in X$ be a smooth point. Then,
$\Omega$ is generically finite if and only if

$$k \ge \dim(X),$$

and $\Omega$ is generically injective if

$$k > \dim(X).$$

Proof. The statements above are more or less folklore; they follow from the more general
height-theorem-like statement that

$$\operatorname{codim}(X\cap H) = \operatorname{codim}(X) + \operatorname{codim}(H) = \operatorname{codim}(X) + n - k,$$

where H is a generic k-flat, a proof of which can be found, for example, in the Appendix of [23].
Then, the first statement about generic finiteness follows by taking a generic $y \in \Omega(X)$ and
observing that $\Omega^{-1}(y) = H\cap X$, where H is generic if $k \le \dim(X)$. That implies in particular
that if $k = \dim(X)$, then the fiber $\Omega^{-1}(\Omega(x))$ for a generic $x \in X$ consists of finitely many points,
which can be separated by an additional random projection; thus the statement about generic
injectivity follows. □
Intuitively, Theorem 2.7.14 can be interpreted in two ways. On one hand, it means that any
point on X can be reconstructed from exactly dim(X) random linear projections. On the other
hand, it means that if the chosen coordinate system in which X lives is random, then dim(X)
entries in the mask suffice for (finite) identifiability of the map, and no more structural information
is needed. In view of Theorem 2.7.8, this implies that the log-factor and the probabilistic
phenomena in identifiability only occur when the variety X is in a sense degenerate with respect to
the chosen coordinate system, or, in other words, intrinsically aligned.



3. Algorithms 

3.1 Randomized Algorithms for Completability 

In the following, we describe some algorithms which can be derived from the theory of
differentials and matroids in section 2.5. The algorithms in this section answer the question of which
entries of a rank r matrix can in principle be computed from the given ones. As Theorems 2.3.5
and 2.4.4 show, for a generic matrix this depends only on the position of the known resp. measured
entries, encoded in the so-called mask, and not on the values of the entries. Calculations
in the vector space of evaluated differentials then allow one to determine the entries which
can be reconstructed up to finite choice.

First, with Algorithm 1 we present a randomized algorithm which checks whether all missing
entries can be reconstructed up to finite choice; in [38], a very similar algorithm was already
conjectured to be correct. In step 1, an (m × r)-matrix U and an (n × r)-matrix V are sampled
from a generic probability distribution. One can show (see for example the genericity section
in the appendix of [23]) that any continuous probability distribution, e.g., any Gaussian distribution on
the matrices, will be generic and fulfill the properties of Definition 2.2.1. Thus, $A = UV^\top$ will be
a generic (m × n)-matrix of rank r. In step 2, the differentials of A for all its known entries are
calculated. The differentials $da_{ij}$ are contained in the formal $\mathbb{C}$-vector space generated by all $dU_{ik}$ and
$dV_{jk}$; thus they can be conveniently represented as a (rm + rn)-vector, or an (r × (m + n))-matrix. In
step 3, their span is calculated. By checking their span and its dimension in step 4, e.g., numerically,
one can decide whether M is finitely completable, which follows from the equivalence of
Theorem 2.5.14 (i) and (v), and Corollary 2.5.30, proving correctness of Algorithm 1.



Algorithm 1 Finite completability in rank r.

Input: An (m × n) mask M. Output: Whether M is finitely completable in rank r.
1: Randomly sample $U \in \mathbb{C}^{m\times r}$, $V \in \mathbb{C}^{n\times r}$.
2: For all $(i,j) \in E(M)$, calculate

   $$da_{ij} := \sum_{k=1}^{r} \left(V_{jk}\,dU_{ik} + U_{ik}\,dV_{jk}\right),$$

   where the $dU_{ik}, dV_{jk}$ are to be considered as formal basis vectors of a (rm + rn)-dimensional
   vector space.
3: Set $V_M = \operatorname{span}\{da_{ij} \,;\, (i,j) \in E(M)\}$.
4: If $\dim_{\mathbb{C}}(V_M) = r\cdot(m+n-r)$, return "finitely completable", else "not finitely completable".
Algorithm 2 Matroid rank.

Input: An (m × n) mask M. Output: The matroid rank of M in rank r.
1: Randomly sample $U \in \mathbb{C}^{m\times r}$, $V \in \mathbb{C}^{n\times r}$.
2: For all $(i,j) \in E(M)$, calculate

   $$da_{ij} := \sum_{k=1}^{r} \left(V_{jk}\,dU_{ik} + U_{ik}\,dV_{jk}\right),$$

   where the $dU_{ik}, dV_{jk}$ are to be considered as formal basis vectors of a (rm + rn)-dimensional
   vector space.
3: Set $V_M = \operatorname{span}\{da_{ij} \,;\, (i,j) \in E(M)\}$.
4: Return $\operatorname{rk}_r(M) = \dim_{\mathbb{C}} V_M$.
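Algorithms 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: instead of complex Gaussian samples and a numerical rank, it draws random positive integers for U and V and computes the rank exactly over the rationals, which should be generic with high probability. The function names and the 0-based edge-list encoding of the mask are hypothetical.

```python
import random
from fractions import Fraction

def rank_exact(rows):
    """Rank of a matrix (list of equal-length lists) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            if rows[i][col] != 0:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def differential(i, j, U, V):
    """Coordinates of da_ij = sum_k (V_jk dU_ik + U_ik dV_jk) in the basis (dU, dV)."""
    m, n, r = len(U), len(V), len(U[0])
    row = [0] * (r * (m + n))
    for k in range(r):
        row[i * r + k] = V[j][k]          # coefficient of dU_ik
        row[m * r + j * r + k] = U[i][k]  # coefficient of dV_jk
    return row

def matroid_rank(mask_edges, m, n, r, seed=0):
    """Algorithm 2: dimension of the span of the differentials of the observed entries."""
    rng = random.Random(seed)
    U = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(m)]
    V = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(n)]
    return rank_exact([differential(i, j, U, V) for (i, j) in mask_edges])

def is_finitely_completable(mask_edges, m, n, r, seed=0):
    """Algorithm 1: compare the matroid rank with r * (m + n - r)."""
    return matroid_rank(mask_edges, m, n, r, seed) == r * (m + n - r)

# A 3 x 3 mask observing the first row and first column, tested in rank 1.
star = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]
completable = is_finitely_completable(star, 3, 3, 1)
```

Exact arithmetic avoids choosing a numerical rank tolerance, at the price of speed; for masks of realistic size a floating-point singular value decomposition would be the more practical choice.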



Similar principles can be used to calculate the number of degrees of freedom contained in a
set of given entries of a matrix. Theorem 2.3.3 again shows that, generically, it depends only
on the position of the entries. In Algorithm 2, we perform the same steps as in Algorithm 1 up to
step 4, where we return the (numerical) dimension of the span instead of checking whether it
equals $r\cdot(m+n-r)$. Indeed, Algorithm 1 can be obtained as the algorithm which just checks
whether $\operatorname{rk}_r(M) = r\cdot(m+n-r)$. The correctness of Algorithm 2 follows from the equivalence of
Theorem 2.5.14 (i) and (v), and Proposition Proprrkdeg. Corollary 2.5.30 exhibits the relation
to Algorithm 1. Also note that, since many objects in matroid theory like circuits, independence,
bases, etc., can be characterized by an evaluation of the rank (compare Proposition 2.5.21),
Algorithm 2 can in fact be used to classify or find such objects and determine their properties.
For example, Algorithm 2 can be used in classical matroid theoretical algorithms to find or count
circuits, bases, or for determining the structure of the whole determinantal matroid.



Algorithm 3 Completable closure.

Input: An (m × n) mask M. Output: The completable closure N of M in rank r.
1: Randomly sample $U \in \mathbb{C}^{m\times r}$, $V \in \mathbb{C}^{n\times r}$.
2: For all (i, j), calculate

   $$da_{ij} := \sum_{k=1}^{r} \left(V_{jk}\,dU_{ik} + U_{ik}\,dV_{jk}\right),$$

   where the $dU_{ik}, dV_{jk}$ are to be considered as formal basis vectors of a (rm + rn)-dimensional
   vector space.
3: Set $V_M = \operatorname{span}\{da_{ij} \,;\, (i,j) \in E(M)\}$.
4: For each (i, j), calculate whether $da_{ij} \in V_M$.
5: Define N as $E(N) = \{(i,j) \,;\, da_{ij} \in V_M\}$.
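The closure computation can be sketched in the same spirit. The sketch below is an assumption-laden illustration rather than the authors' code: it samples integer U and V, uses exact rational arithmetic, and tests membership $da_{ij} \in V_M$ by checking that appending the row does not increase the rank. The name `completable_closure` and the edge-list encoding are hypothetical.

```python
import random
from fractions import Fraction

def rank_exact(rows):
    """Rank of a matrix (list of equal-length lists) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            if rows[i][col] != 0:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def completable_closure(mask_edges, m, n, r, seed=0):
    """Algorithm 3: all positions (i, j) whose differential lies in the span V_M."""
    rng = random.Random(seed)
    U = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(m)]
    V = [[rng.randint(1, 10**6) for _ in range(r)] for _ in range(n)]

    def d(i, j):
        # Coordinates of da_ij in the formal basis (dU_ik, dV_jk).
        row = [0] * (r * (m + n))
        for k in range(r):
            row[i * r + k] = V[j][k]
            row[m * r + j * r + k] = U[i][k]
        return row

    base = [d(i, j) for (i, j) in mask_edges]
    base_rank = rank_exact(base)
    # da_ij is in the span iff adding it as a row leaves the rank unchanged.
    return {(i, j) for i in range(m) for j in range(n)
            if rank_exact(base + [d(i, j)]) == base_rank}

# Rank 1, three observed entries of a 2 x 2 corner inside a 3 x 3 matrix.
closure = completable_closure([(0, 0), (0, 1), (1, 0)], 3, 3, 1)
```

In this small rank 1 example the closure adds exactly the entry (1, 1), which is determined by the vanishing of the 2 × 2 minor on rows {0, 1} and columns {0, 1}; nothing in the third row or column is determined.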



The randomized strategy, in its most general setting, allows us to compute the set of all entries
which are in principle reconstructible, up to finite choice, from the known entries; Algorithm 3
can be used to do that. Steps 1 to 3 are analogous to those in Algorithms 1 and 2, with the small
distinction that in step 2 all differentials are computed, since they correspond to the entries,
and one has to check for all entries whether they can be reconstructed or not. In step 4, one
then checks numerically for each entry whether its differential is in the span of those in the mask
or not; an entry is reconstructible if and only if its differential is contained in the span, which follows from the
equivalence of Theorem 2.5.14 (i) and (v). Algorithm 1 can be seen as a special and simplified
case of Algorithm 3, where it is only checked whether the completable entries are all entries or
not. Note also that Algorithm 3 can be applied to determine whether a partial mask (N, M) is
finitely completable, by checking whether G(N) lies in the completable closure of G(M) (i.e.,
whether the completable closure of M minus N is non-negative).



3.2 Algorithms for checking Closability 

An algorithm for one-step r-closure is shown in Algorithm 4. Roughly speaking, we look through
each missing edge $(i,j) \in V\times W\setminus E$ and check whether it can be closed by known edges. This
can be done by finding the neighbors J = N(i) of i in W and the neighbors I = N(j) of j in V, and
checking if the subgraph $(I, J, I\times J\cap E)$ contains an r × r bi-clique. This is shown in Algorithm 6.
Since an r × r clique cannot contain any vertex with degree less than r, we prune these vertices
beforehand; this is shown in Algorithm 5.

Algorithm 4 CloseOneStep((V, W, E), r)
Inputs: bipartite graph (V, W, E), rank r.

Output: Associative array $C : (i,j) \to (I,J)$ where $(i,j) \in V\times W\setminus E$ and $I \subseteq V$, $J \subseteq W$ such that
$(i',j') \in E$ for all $(i',j') \in (I\cup\{i\})\times(J\cup\{j\})$ except $(i',j') = (i,j)$.
for each missing edge (i, j) in $V\times W\setminus E$ do
   Let $I \leftarrow N(j)$, $J \leftarrow N(i)$, and $E' \leftarrow I\times J\cap E$.
   $(I, J, E') \leftarrow$ PruneNodesWithDegreeLessThan((I, J, E'), r).
   if |I| < r or |J| < r then
      Continue.
   end if
   $(I', J') \leftarrow$ FindAClique((I, J, E'), r).
   if |I'| > 0 and |J'| > 0 then
      $C(i,j) \leftarrow (I', J')$.
   end if
end for



Algorithm 5 PruneNodesWithDegreeLessThan((I, J, E), d)
Inputs: bipartite graph (I, J, E), minimum degree d.
Output: pruned bipartite graph (I, J, E).
while $\exists\, I' \subseteq I$ or $J' \subseteq J$ with maximum degree less than d do
   $I \leftarrow I\setminus I'$.
   $J \leftarrow J\setminus J'$.
end while
$E \leftarrow I\times J\cap E$.



One can decide if a bipartite graph (V, W, E) is r-closable or not by repeatedly applying
Algorithm 4 and checking if the graph is a complete bipartite graph when there are no more edges to
add; this is shown in Algorithm 7.
Algorithm 6 FindAClique((I, J, E), r)

Inputs: bipartite graph (I, J, E), size of the bipartite clique to be found r.
Output: vertex sets of a clique (I, J).
if |I| < r or |J| < r then
   return $(I, J) \leftarrow (\{\}, \{\})$.
end if
if r = 1 then
   return $(I, J) \leftarrow (\{i\}, \{j\})$ with any $(i,j) \in E$.
end if
$(I, J, E) \leftarrow$ PruneNodesWithDegreeLessThan((I, J, E), r).
for each $(i,j) \in E$ do
   $I' \leftarrow N(j)\setminus\{i\}$, $J' \leftarrow N(i)\setminus\{j\}$, $E' \leftarrow I'\times J'\cap E$.
   $(I', J') \leftarrow$ FindAClique((I', J', E'), r - 1).
   if |I'| > 0 and |J'| > 0 then
      return $(I, J) \leftarrow (I'\cup\{i\}, J'\cup\{j\})$.
   end if
end for
return $(I, J) \leftarrow (\{\}, \{\})$.



Algorithm 7 IsClosable((V, W, E), r)
Inputs: bipartite graph (V, W, E), rank r.

Output: binary (true means closable, and false means not closable).
repeat
   $n_{\text{non-zero}} \leftarrow |E|$.
   $C \leftarrow$ CloseOneStep((V, W, E), r).
   $E \leftarrow E\cup \operatorname{keys}(C)$.
until $n_{\text{non-zero}} = |E|$
Return true if $E = V\times W$ else false.
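For small graphs, the closure step can be prototyped by brute force. The following Python sketch is a simplification under stated assumptions: it replaces the pruning and recursive clique search of Algorithms 5 and 6 by an exhaustive search over r-subsets, so it is only practical for tiny instances. The function names mirror the pseudocode but the code itself is hypothetical.

```python
from itertools import combinations

def close_one_step(V, W, E, r):
    """Map each closable missing edge (i, j) to an r x r biclique (I, J) that closes it."""
    E = set(E)
    closed = {}
    for i in V:
        for j in W:
            if (i, j) in E:
                continue
            I_cand = [a for a in V if (a, j) in E]  # N(j): rows observed in column j
            J_cand = [b for b in W if (i, b) in E]  # N(i): columns observed in row i
            found = None
            for I in combinations(I_cand, r):
                for J in combinations(J_cand, r):
                    if all((a, b) in E for a in I for b in J):
                        found = (I, J)
                        break
                if found:
                    break
            if found:
                closed[(i, j)] = found
    return closed

def is_closable(V, W, E, r):
    """Repeat one-step closure until no edge is added; test for the complete graph."""
    E = set(E)
    while True:
        C = close_one_step(V, W, E, r)
        if not C:
            break
        E |= set(C)
    return len(E) == len(V) * len(W)

# Rank 1: first row and first column observed in a 3 x 3 mask.
V = W = [0, 1, 2]
star = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
C = close_one_step(V, W, star, 1)
```

Because i is never a neighbor of j when (i, j) is missing, the candidate sets automatically exclude the row and column of the missing entry, so every returned (I, J) together with (i, j) forms a $K^-_{r+1,r+1}$.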
3.3 Algebraic Reconstruction of Matrices

The algorithm described in the previous subsection can be used to actually perform matrix
completion. Each entry in the associative array $C : (i,j) \to (I,J)$ provides a valid (r + 1) × (r + 1)
vanishing minor condition, which we can exploit to fill one missing entry. Our algorithm is
implemented in a depth-first manner to minimize propagation of numerical errors. The details are
described in Algorithm 8.

Algorithm 8 CompletionByClosure(A, (V, W, E), r)

Inputs: Partially observed matrix A, bipartite graph (V, W, E), rank r.
Output: Completed matrix A, list of associative arrays $C_{\text{save}}$.
$C_{\text{save}} \leftarrow [\,]$.
repeat
   $n_{\text{non-zero}} \leftarrow |E|$.
   $C \leftarrow$ CloseOneStep((V, W, E), r).
   $C_{\text{save}} \leftarrow [C_{\text{save}}, C]$.
   for each $(i,j) \in \operatorname{keys}(C)$ do
      $E \leftarrow E\cup\{(i,j)\}$.
      $A(i,j) = A(i,J)\,A(I,J)^{+}\,A(I,j)$ where $(I,J) = C(i,j)$.
   end for
until $n_{\text{non-zero}} = |E|$
Return $(A, C_{\text{save}})$.
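The update in the inner loop fills one entry from a vanishing (r + 1) × (r + 1) minor. A minimal Python sketch of that single step follows, assuming the r × r block A(I, J) is invertible so that the pseudoinverse reduces to the ordinary inverse; it uses exact rational arithmetic, and the helper names `solve` and `fill_entry` are hypothetical.

```python
from fractions import Fraction

def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination (M square and invertible)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[c])]
    return [aug[i][n] for i in range(n)]

def fill_entry(A, i, j, I, J):
    """Return A[i][J] * inverse(A[I][J]) * A[I][j], the value forced by the minor."""
    block = [[A[a][b] for b in J] for a in I]
    x = solve(block, [A[a][j] for a in I])  # x = inverse(A[I][J]) * A[I][j]
    return sum(Fraction(A[i][b]) * xb for b, xb in zip(J, x))

# Rank 2 example: rows are [1,3,5], [2,4,6] and their sum; entry (2, 2) treated as missing.
A = [[1, 3, 5], [2, 4, 6], [3, 7, None]]
value = fill_entry(A, 2, 2, (0, 1), (0, 1))
```

In floating point, the conditioning of A(I, J) governs how much error each filled entry inherits, which is one reason the order in which entries are filled matters.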



4. Experiments 

4.1 Randomized Algorithms for Completability 

In this section we will investigate the set of entries which is completable from a set of given
entries. In section 2.4 we have seen that the completable set of entries does not depend on the
value of the entries but only on their position, and Algorithm 3 provides a method to compute it.

In order to illustrate the input and the output of Algorithm 3, we first give an example pair
of input and output.

Example 4.1.1. The example input for Algorithm 3 consists of the mask M, which has ones at the
positions of the known entries; the output is the mask N, which has ones at the positions of the
entries which can be reconstructed up to finite choice. The rank is set to r = 2. For the input

Algorithm 3 computes the output

For a quantitative analysis, we perform experiments to investigate how the expected number
of completable entries is influenced by the number of known entries. In particular, section 2.6
suggests that a phase transition between the state where only very few additional entries can be
reconstructed and the state where a large set of entries can be reconstructed should take place
at some point. Figure 3 shows that this is indeed the case when slowly increasing the number
of known entries: first, the set of reconstructible entries is roughly equal to the set of known
entries, but then, a sudden phase transition occurs and the set of reconstructible entries quickly
reaches the set of all entries.
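To make the notion of a completable closure concrete, the following Python sketch computes it for a small mask. It is not a transcription of Algorithm 3. It assumes the random-point Jacobian approach referred to in Remark A.14: a rank-r matrix is parameterized as U V^T, and an unobserved entry is declared finitely completable if adding its Jacobian row does not increase the rank of the Jacobian of the observed entries at a random point (U, V). The function names and the tolerance are illustrative choices.

```python
import numpy as np

def jacobian_rows(U, V, entries):
    """Jacobian rows of (U, V) -> (U V^T)[i, j], one row per entry (i, j)."""
    m, r = U.shape
    n, _ = V.shape
    rows = np.zeros((len(entries), r * (m + n)))
    for k, (i, j) in enumerate(entries):
        rows[k, i * r:(i + 1) * r] = V[j]                        # d / dU[i, :]
        rows[k, m * r + j * r:m * r + (j + 1) * r] = U[i]        # d / dV[j, :]
    return rows

def completable_closure(mask, r, seed=0, tol=1e-8):
    """Mask of entries that are finitely completable in rank r (numerical test)."""
    rng = np.random.default_rng(seed)
    m, n = mask.shape
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    known = [(i, j) for i in range(m) for j in range(n) if mask[i, j]]
    base = jacobian_rows(U, V, known)
    base_rank = np.linalg.matrix_rank(base, tol=tol) if known else 0
    closure = mask.astype(bool).copy()
    for i in range(m):
        for j in range(n):
            if not closure[i, j]:
                extended = np.vstack([base, jacobian_rows(U, V, [(i, j)])])
                closure[i, j] = np.linalg.matrix_rank(extended, tol=tol) == base_rank
    return closure

M = np.array([[1, 1], [1, 0]])
N = completable_closure(M, r=1)   # the fourth entry of a rank-1 2 x 2 matrix
```

Repeating this for random masks with a growing number of known entries, and counting the ones in the output, gives curves of the kind reported in Figure 3.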



4.2 Phase Transitions 

Figure 4 shows phase transition curves of various conditions for 100 x 100 matrices at rank 3.
We consider the uniform sampling model here. More specifically, we generated random 100 x 100
masks with various numbers of observed entries by first randomly sampling the order of edges
(using the MATLAB randperm function) and sequentially adding 100 entries at a time from 100
to 6000. In this way, we made sure to preserve the monotonicity of the properties considered
here. This experiment was repeated 100 times and averaged to obtain estimates of success
probabilities. The conditions plotted are (a) minimum degree at least r, (b) r-connected, (c)
finitely completable at rank r, (d) r-closable, (e) nuclear norm successful, and (f) one step
r-closable. For (e), we solved the following minimization problem

$$\min_X \|X\|_* \quad \text{subject to} \quad X(i,j) = A(i,j) \quad \forall (i,j) \in E,$$

where $\|X\|_* = \sum_j \sigma_j(X)$ is the nuclear norm of X, i.e., the sum of its singular
values. The success of nuclear norm minimization is defined as the relative error
$\|X - A\|_F / \|A\|_F$ being less than 0.01, where X is the minimizer of the above
minimization problem.

Figure 3: Expected number of completable entries (in rank r) versus the number of known
entries, where the positions of the known entries are uniformly randomly sampled in an
(m x n)-matrix. Panels: (a) results for m = 15, n = 15, r = 2; (b) results for m = 20, n = 20,
r = 5. The expected number of completable entries was estimated for each data point from
repeated calculations of the completable closure (200 for r = 2, and 20 for r = 5). The blue
solid line is the median, the blue dotted lines are the two other quartiles. The black dotted line
is the total number of entries, m · n.
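The sampling protocol and the success criterion described above can be sketched in Python as follows. This is an illustration rather than the original MATLAB code: `numpy`'s `permutation` plays the role of `randperm`, only condition (a) is implemented, and the nuclear norm solver itself is omitted (any convex solver could be plugged in). All function names are hypothetical.

```python
import numpy as np

def nested_masks(m, n, step, max_entries, rng):
    """Yield (k, mask) for nested masks: one random edge order, `step` entries at a time."""
    order = rng.permutation(m * n)              # analogue of MATLAB's randperm
    mask = np.zeros(m * n, dtype=bool)
    for k in range(step, max_entries + 1, step):
        mask[order[k - step:k]] = True
        yield k, mask.reshape(m, n).copy()

def min_degree_at_least(mask, r):
    """Condition (a): every row and every column has at least r observed entries."""
    return bool(mask.sum(axis=1).min() >= r and mask.sum(axis=0).min() >= r)

def nuclear_norm(X):
    """Sum of the singular values of X."""
    return float(np.linalg.svd(X, compute_uv=False).sum())

def is_success(X, A, tol=0.01):
    """Success criterion for (e): relative Frobenius error below tol."""
    return bool(np.linalg.norm(X - A, "fro") / np.linalg.norm(A, "fro") < tol)

rng = np.random.default_rng(0)
curve = [(k, min_degree_at_least(mask, 3))
         for k, mask in nested_masks(100, 100, 100, 6000, rng)]
```

Because the masks are nested, a monotone property such as (a) can switch from false to true only once along `curve`; averaging such indicator curves over independent repetitions gives the success probability estimates.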

The success probabilities of (a) minimum degree, (b) r-connected, and (c) finitely completable
are almost on top of each other, and exceed probability 0.5 around |E| ≈ 1,000. The success
probability of (d) r-closable passes through 0.5 around |E| ≈ 1,300. Therefore the optimality
gap of the r-closure method is small in this setting. On the other hand, nuclear norm
minimization required about 2,200 entries to succeed with probability larger than 0.5.

Figure 5 shows the same plot as above for 100 x 100 matrices at rank 6. The success
probabilities of (a) minimum degree, (b) r-connected, and (c) finitely completable are again
almost the same, and exceed probability 0.5 around |E| ≈ 1,400. On the other hand, the number
of entries required for r-closability is at least 3,700, whereas that required for the nuclear norm
minimization to succeed is only 3,100.

Figure 6 shows the phase transition from a non-completable mask to a finitely completable
mask for almost 2r-regular random masks. Here we first randomly sampled n x n 2r-regular
masks using the Steger & Wormald algorithm [41]. Next we randomly permuted the edges included
in the mask and the edges not included in the mask independently and concatenated them into
a single list of edges. In this way, we obtained a length mn ordered list of edges that becomes
2r-regular exactly at the 2rn-th edge. For each ordered list sampled this way, we took the first
2rn + i edges and checked whether the mask corresponding to these edges was finitely completable
or not for i = −15, −14, ..., 5. This procedure was repeated 100 times and averaged to obtain a
probability estimate. In order to make sure that the phase transition is indeed caused by the
regularity of the mask, we conducted the same experiment with row-wise 2r-regular masks, i.e.,
each row of the mask contained exactly 2r entries while the number of non-zero entries varied
from one column to another.

Figure 4: Phase transition curves of various conditions for 100 x 100 matrices at rank 3.

Figure 5: Phase transition curves of various conditions for 100 x 100 matrices at rank 6.
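The construction of the ordered edge list can be sketched as follows. The sketch takes an arbitrary 2r-regular mask as input; since the Steger & Wormald sampler is not reproduced here, a circulant mask is used as a deterministic stand-in, which is an assumption made purely for illustration. The function names are hypothetical.

```python
import numpy as np

def ordered_edge_list(mask, rng):
    """Edges inside the mask first, then edges outside; each group shuffled independently."""
    inside = np.argwhere(mask)
    outside = np.argwhere(~mask)
    return np.vstack([rng.permutation(inside), rng.permutation(outside)])

def prefix_mask(edges, k, shape):
    """Mask formed by the first k edges of the ordered list."""
    out = np.zeros(shape, dtype=bool)
    out[edges[:k, 0], edges[:k, 1]] = True
    return out

n, r = 10, 2
# Stand-in 2r-regular mask (circulant); NOT a uniformly sampled regular graph.
base = np.zeros((n, n), dtype=bool)
for shift in range(2 * r):
    base[np.arange(n), (np.arange(n) + shift) % n] = True

edges = ordered_edge_list(base, np.random.default_rng(0))
almost_regular = {i: prefix_mask(edges, 2 * r * n + i, (n, n)) for i in range(-15, 6)}
```

Each mask in `almost_regular` can then be passed to a finite completability test; for i = 0 the prefix reproduces the regular mask exactly.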

In Figure 6 the phase transition curves for different n at rank 2 and 3 are shown. The two
plots in the top part show the results for the 2r-regular masks, and the two plots in the bottom
show the same results for the 2r-row-wise regular masks. For the 2r-regular masks, the success
probability of finite completability sharply rises when the number of edges exceeds 2rn − r²
(i = −4 for r = 2 and i = −9 for r = 3); the phase transition is already rather sharp for n = 10,
and for n > 20 the success probability is almost exactly zero or one. On the other hand, the
success probabilities for the 2r-row-wise regular masks grow rather slowly and approach zero for
large n. This is natural, since it is likely for large n that there is some column with fewer than
r non-zero entries, which violates the necessary condition in Proposition 2.3.30.
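The explanation given for the row-wise regular masks can be probed with a short simulation. The Python sketch below samples row-wise 2r-regular masks and estimates how often some column has fewer than r observed entries; the sampling scheme (uniform column choice without replacement in each row) and all function names are assumptions made for illustration.

```python
import numpy as np

def rowwise_regular_mask(n, r, rng):
    """n x n mask in which every row has exactly 2r observed entries."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        mask[i, rng.choice(n, size=2 * r, replace=False)] = True
    return mask

def has_deficient_column(mask, r):
    """True if some column has fewer than r observed entries."""
    return bool((mask.sum(axis=0) < r).any())

def deficient_fraction(n, r, trials, seed=0):
    """Fraction of sampled masks that violate the minimum column degree condition."""
    rng = np.random.default_rng(seed)
    hits = sum(has_deficient_column(rowwise_regular_mask(n, r, rng), r)
               for _ in range(trials))
    return hits / trials

small = deficient_fraction(10, 2, trials=50)
large = deficient_fraction(200, 2, trials=50)
```

Comparing `small` and `large` indicates how the fraction of masks with a deficient column changes with n; such masks cannot be finitely completable in rank r.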
Figure 6: Phase transition in an almost regular mask.



5. Conclusion 

In this paper we have shown that Low-Rank Matrix Completion is a task with both algebraic
and combinatorial structure. We have shown that this structure can be exploited both in the
theoretical analysis and in the construction of algorithms for Matrix Completion. We thus reason
that using the inherent algebraic structure of a Machine Learning problem is beneficial and thus
preferable to structure-agnostic methodology.

For the problem of Matrix Completion, we have also shown that its behavior depends crucially
on the sampling process while only marginally on the generative truth. That is, given some
entries of a low-rank matrix, the set of entries which can be reconstructed from these entries and
the condition that the matrix has low rank depends only on the positions of the known entries,
and not on their particular values. Similarly, the properties of the reconstruction process can
be made independent of the values of the entries. We argue that this is more natural than
assuming the converse, i.e., that the intrinsic properties of the task are determined by the
generative truth, since the generative truth may change while the problem itself and thus its
intrinsic properties should not. Indeed, our results on Algebraic Compressed Sensing, which
generalize our findings on Matrix Completion, seem to imply that this is in fact a general
principle in Compressed Sensing: the properties of the sampling, for example reconstructibility,
necessary and sufficient sampling densities, etc., should be independent of the particular signal,
and only dependent on sampling and compression properties.

We have also presented several combinatorial objects which can be used to study the possible
set of completable entries in an incomplete low-rank matrix. Namely, one can associate bipartite
graphs to patterns of entries and study the degrees of freedom, or the set of completable entries,
by analyzing properties of the associated matroid. Moreover, the asymptotics of the necessary
number of entries for reconstruction can now be studied via the asymptotics of the bipartite
graphs and the matroids. The theory of formal differentials can be used to design algorithms for
calculating the combinatorial objects and their implications for reconstructability.

The algorithms presented in this paper not only allow the theoretical study of the phase
transitions involved in the Matrix Completion of large matrices, but also give the practitioner
efficient tools to determine which entries of a matrix can be completed, i.e., which reconstructed
entries can be trusted, as well as methods which use the algebraic combinatorial structure to
calculate the reconstruction itself.

We conjecture that the methods and principles presented in this paper can also be applied to
a wider class of problems with algebraic-combinatorial structure, in particular:

• Matrix Completion with different constraints. Completing low-rank matrices which are 
symmetric, antisymmetric, Hermitian, real, or endowed with other combinatorial or alge- 
braic constraints can be studied by using analogous methods. Also, the task of completing 
matrices which instead of the low-rank constraint fulfill different algebraic boundary con- 
ditions, e.g., different types of rigidity, or sparsity or hybrid properties, can be recast in our 
framework. 

• Tensor Methods. Similar to matrices, low-rank tensors can be expressed as being
contained in an algebraic manifold. Projections of tensors can be treated in a similar way
to matrices, and the algebraic-combinatorial structures generalize, involving multigraphs
and their structured matroids.

• Algebraic Compressed Sensing. If the signal is parameterized by a finite-dimensional
set of parameters, and the compression constraints can be given by polynomial equations,
many of our methods which are not specific to Low-Rank Matrix Completion apply, in
particular our theory for the study of the sampling properties independently of signal
properties.

Concluding, we argue that the additional use of algebraic or combinatorial structure in a
Machine Learning problem can only be beneficial compared to not using it. Since Algebraic
Geometry, Combinatorial Algebra and Discrete Mathematics are the proper tools to analyze and
utilize this kind of structure, we claim that Machine Learning, as well as the mentioned fields,
can only profit from more widespread interdisciplinary collaboration with and between each
other.
A. Sheaves of Matroids

Several results in section 2.5.3 can be seen as an instance of a single principle in Algebraic
Geometry, where one considers the dependence structure of sections on a scheme, e.g., a complex
algebraic variety. The results will imply that the dependencies between the sections exhibit a
generic behavior, which implies constant behavior on Zariski open sets of the variety.

The basic idea of associating a generic matroid to a real or complex variety via the formal
differential is well-known in the combinatorial rigidity community (e.g., [42] contains an
explicit discussion) and has been derived in a number of concrete cases, for example [16], [24],
[36], [42], [43], [45], among others. Thus, the form of Theorem A.10 will not surprise experts.
However, we are unaware of the general statement appearing in print.

Definition A.1. Let $R$ be an integral domain with field of fractions $K$, let $M$ be an
$R$-module, and let $A \subset M$ be a finite multiset (a set where we allow finite repetitions).
To $A$, we associate a matroid $M[A]$ in the following way: the matroid is defined over the
power set $\mathcal{P}(A)$ of $A$, and the independent sets are exactly the subsets
$J \subset A$ such that

$$\dim_K (K \cdot J) = \# J,$$

where $K \cdot J$ denotes the $K$-submodule of $K \otimes_R M$ generated by $J$. By convention,
$K \cdot \emptyset$ is the trivial $K$-module, i.e., the zero module.

That $M[A]$ is in fact a matroid follows since $K \cdot J$ is a $K$-vector space.
Notice that the rank function of $M[A]$ is exactly

$$\operatorname{rk}(J) = \dim_K (K \cdot J).$$
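As a concrete instance of Definition A.1, take R to be the integers, K the rationals and M a free module of rank 3, so that elements of A are integer vectors. The following Python sketch lists the independent sets of M[A] by comparing rank and cardinality; floating-point rank computation is used as a stand-in for exact linear algebra over K, and the function names and the example multiset are illustrative.

```python
import numpy as np
from itertools import combinations

def rank(vectors, tol=1e-8):
    """rk(J) = dim_K (K . J); the empty multiset spans the zero module."""
    if len(vectors) == 0:
        return 0
    return int(np.linalg.matrix_rank(np.array(vectors, dtype=float), tol=tol))

def is_independent(vectors):
    """J is independent exactly if dim_K (K . J) = #J."""
    return rank(vectors) == len(vectors)

# A finite multiset in Z^3: the last element repeats the first one.
A = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)]

independent_sets = [idx
                    for k in range(len(A) + 1)
                    for idx in combinations(range(len(A)), k)
                    if is_independent([A[t] for t in idx])]
```

Subsets are indexed by position rather than by value, so the two copies of (1, 0, 0) are kept apart as the multiset convention requires; together they form a dependent set.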

Definition A.2. Let $\varphi : R \to S$ be a morphism of integral domains, let $M$ be an
$R$-module, and let $A \subset M$ be a finite multiset. The morphism $\varphi$ induces a map of
matroids

$$\varphi : M[A] \to M[\varphi(A)],$$

where $\varphi(A)$ is the canonical image of $A$ in the tensor product $S \otimes_R M$,
considered as $S$-module. Accordingly, we will write $\varphi(M[A]) := M[\varphi(A)]$, or
$M[A] \otimes_R S := M[\varphi(A)]$.

Note that the map of matroids is a well-defined homomorphism, as dependent sets are
mapped onto dependent sets.

Notations A.3. In the following, let $X$ be an integral, locally Noetherian scheme, and let
$\mathcal{F}$ be a coherent sheaf on $X$.

Definition A.4. We define a presheaf of matroids $\mathcal{M}_{\mathcal{F}}$, the matroid of
sections, in the following way: To each Zariski open subset $U$ of $X$, we associate the set of
all matroids

$$\mathcal{M}_{\mathcal{F}}(U) = \{ M[A] \;;\; A \subset \mathcal{F}(U),\ \# A < \infty \},$$

where $\mathcal{F}(U)$ is considered as an $\mathcal{O}_X(U)$-module. To a restriction
$V \subseteq U$ of open subsets of $X$, one associates the map of sets of matroids which is
induced by the maps of matroids

$$M[A] \to \operatorname{res}_{V,U}(M[A]),$$

which are in turn induced by the usual restriction morphism of the structure sheaf

$$\operatorname{res}_{V,U} : \mathcal{O}_X(U) \to \mathcal{O}_X(V).$$

Since the sheaf axioms hold, by assumption, for $\mathcal{F}$, they directly transfer to the
matroid of sections $\mathcal{M}_{\mathcal{F}}$, making it a sheaf.

Proposition A.5. Let $V \subset U \subset X$ be open subsets, let
$M \in \mathcal{M}_{\mathcal{F}}(U)$. Then, as a matroid, $M$ is isomorphic to
$\operatorname{res}_{V,U}(M)$.

Proof. Since $X$ is irreducible, the quotient fields of $\mathcal{O}_X(U)$ and
$\mathcal{O}_X(V)$ agree. Thus, the rank functions agree for every $J$ and its image under the
restriction to $V$, proving isomorphy of matroids. □

Remark A.6. By going to the direct limit, Proposition A.5 implies that for $U$ open in $X$, and
$x \in U$, any matroid $M \in \mathcal{M}_{\mathcal{F}}(U)$ is isomorphic to its canonical image
in the stalk $\mathcal{M}_{\mathcal{F},x}$ at $x$.

Notations A.7. Let $x \in X$, let $k(x)$ be the residue field at $x$. Denote by

$$.(x) : \mathcal{F}_x \to \mathcal{F}|_x := \mathcal{F}_x \otimes_{\mathcal{O}_{X,x}} k(x)$$

the canonical evaluation of $\mathcal{F}$ at $x$. It induces a map on subsets
$A \subset \mathcal{F}(U)$ with $U \ni x$, and we will write $A(x)$ for the (element-wise)
canonical image of $A$ in $\mathcal{F}|_x$ and call it the evaluation of $A$ at $x$.

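Definition A.8. We will denote by

$$\mathcal{M}_{\mathcal{F}}|_x := \mathcal{M}_{\mathcal{F},x} \otimes_{\mathcal{O}_{X,x}} k(x)$$

the set of matroids $M[A]$ with $A \subset \mathcal{F}|_x$, where $\mathcal{F}|_x$ is
canonically considered as $k(x)$-module. This induces a canonical evaluation morphism

$$.(x) : \mathcal{M}_{\mathcal{F},x} \to \mathcal{M}_{\mathcal{F}}|_x,$$

and for $U \ni x$ and $M \in \mathcal{M}_{\mathcal{F}}(U)$, we will write $M(x)$ for its
canonical image in $\mathcal{M}_{\mathcal{F}}|_x$ and call it the evaluation of $M$ at $x$.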

Definition A.9. Let $U$ be an open set in $X$, let $J \subset \mathcal{F}(U)$, let $x \in U$.
We will denote by $\operatorname{rk}_x(J)$ the number

$$\operatorname{rk}_x(J) = \dim_{k(x)} (k(x) \cdot J(x)).$$

Note that by definition, $\operatorname{rk}_x(J)$ is equal to the matroid rank of $J(x)$ in any
matroid in $\mathcal{M}_{\mathcal{F}}|_x$ (where $J(x)$ is contained in the ground set).

The upper semi-continuity theorem can now be invoked to relate the evaluated matroids
to the non-evaluated ones, and provide a genericity result on the non-degeneracy of the
evaluations.

Theorem A.10. Let $U$ be an open set in $X$, let $J \subset \mathcal{F}(U)$. Then, the function

$$x \mapsto \operatorname{rk}_x(J)$$

is upper semi-continuous in the Zariski topology. Moreover, there is an open dense subset
$V \subseteq U$ such that

$$\operatorname{rk}_x(J) = \operatorname{rk}_\eta(J) = \operatorname{rk}(J) \quad \text{for all } x \in V,$$

where $\eta$ is the generic point of $X$, and the last rank function can be considered in an
arbitrary matroid $M \in \mathcal{M}_{\mathcal{F}}(U)$.

In particular, for each matroid $M \in \mathcal{M}_{\mathcal{F}}(U)$, there is an open dense
subset $V \subseteq U$ such that we have isomorphisms of matroids

$$M(x) \cong M(\eta) \cong M \quad \text{for all } x \in V,$$

where $\eta$ is the generic point of $X$.
Proof. Since $\mathcal{O}_X \cdot J$ is again a coherent sheaf, with

$$k(x) \cdot J(x) = (\mathcal{O}_X \cdot J)|_x,$$

Theorem A.11 implies upper semi-continuity of the map $x \mapsto \operatorname{rk}_x(J)$. This
implies that $\operatorname{rk}_x(J)$ is constant for $x \in V$, where $V$ is open dense in $U$.
Moreover, $\operatorname{rk}_\eta(J)$ is equal to $\operatorname{rk}(J)$ for any matroid
$M \in \mathcal{M}_{\mathcal{F}}(U)$ due to Remark A.6.

Now let $M = M[A]$ with $A \subset \mathcal{F}(U)$ be a matroid in
$\mathcal{M}_{\mathcal{F}}(U)$. Since $A$ is finite, the power set $\mathcal{P}(A)$ also is. By
the above, for each $J \in \mathcal{P}(A)$, there is $V_J$, open dense in $U$, such that

$$\operatorname{rk}_x(J) = \operatorname{rk}_\eta(J) = \operatorname{rk}(J) \quad \text{for all } x \in V_J,$$

where the last rank function is the rank in $M$. Set

$$V := \bigcap_{J \in \mathcal{P}(A)} V_J,$$

which is again open dense in $U$ since $\mathcal{P}(A)$ is finite. Since a matroid is uniquely
characterized by the ranks of all subsets of the ground set, see Proposition 2.5.21, it follows
that

$$M(x) \cong M(\eta) \cong M \quad \text{for all } x \in V,$$

which was the statement to prove. □



Note that Theorem A.10 does not imply that the stalks $\mathcal{M}_{\mathcal{F},x}$ agree on an
open dense subset of $X$. For the sake of completeness, we give the form of the upper
semi-continuity theorem which was used in the proof of Theorem A.10.

Theorem A.11. Let $\mathcal{F}$ be a coherent sheaf on a locally Noetherian scheme $X$. Then,
for $i \in \mathbb{N}$ fixed, the function

$$X \to \mathbb{N}, \qquad x \mapsto \dim_{k(x)} (\mathcal{F}|_x)$$

is upper semi-continuous in the Zariski topology on $X$.

Proof. The proof of this theorem is classical and can be found for example in [17], as
Example 12.7.2, or by specializing Theorem 12.8 to the given setting. □

Finally, we want to stress the relation between the sheaf of matroids over the differentials, and
algebraic independence sheaves, which has already surfaced in section 2.5.2 as Theorem 2.5.14
and implicitly in [38].
Proposition A.12. Let $Y = \operatorname{Spec} k$, where $k$ is a field of characteristic zero
contained in $K[X]$, and let $X \to Y$ be the corresponding morphism of schemes. Denote by
$\Omega_{X/Y}$ the relative sheaf of differentials of degree one.

Then, $\mathcal{M}_{\Omega_{X/Y}}$ is the sheaf of all algebraic independence matroids, i.e., it
is isomorphic to the sheaf $\mathcal{S}$ constructed in the following way: For $U \subset X$
open, the elements of $\mathcal{S}(U)$ are the algebraic independence matroids of finite subsets
of $\mathcal{O}_X(U)$ over $k$, and the restrictions are induced by the restrictions of the
structure sheaf $\mathcal{O}_X$. The isomorphism is given by the canonical differentiation

$$d : \mathcal{O}_X \to \Omega_{X/Y},$$

inducing a canonical map $\mathcal{S} \to \mathcal{M}_{\Omega_{X/Y}}$. Thus, for $U \subset X$
open, $\eta$ the generic point of $X$, $x \in U$ generic, and $M \in \mathcal{S}(U)$, one has
the isomorphisms of matroids

$$M \cong dM(\eta) \cong dM(x).$$



Proof. The last statement directly follows from Theorem A.10, so it suffices to show isomorphy
of $\mathcal{S}$ and $\mathcal{M}_{\Omega_{X/Y}}$.

Since $d$ induces a bijection on the underlying sets of $M$ and $dM$ (note that we have allowed
multisets, so while $d$ may identify elements, they are kept as copies), it suffices to check that
dependent sets in $\mathcal{O}_X(U)$ are mapped to dependent sets in $\Omega_{X/Y}(U)$, and
independent sets in $\mathcal{O}_X(U)$ are mapped to independent sets in $\Omega_{X/Y}(U)$. But
that is implied by Theorem 16.14 in [10], since $k$ has characteristic zero. □



Due to Proposition A.12, it is natural to define the following:

Definition A.13. Keep the situation of Proposition A.12. Then the sheaf of matroids
$\mathcal{M}_{\Omega_{X/Y}}$ is called the algebraic independence sheaf of $X$ over $Y$.



Remark A.14. Proposition A.12 indeed gives not only a guarantee that one can always restrict to
an open dense subset such that the generic matroidal structure is preserved, but also a tool for
algorithmically calculating the generic matroid on irreducible components: namely, sample a
random point on the component and calculate the linear matroid on the respective elements in the
module of relative differentials, evaluated at that random point. The results of section 2.5
rephrase that in a way which is more specific to the case of Matrix Completion and more hands-on.
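The recipe of Remark A.14 can be tried on a toy example over affine space, where the differentials of polynomials are simply their gradients. The Python sketch below is an illustration under stated assumptions, not part of the paper: polynomials are encoded as dictionaries from exponent tuples to coefficients, the random point is drawn from a Gaussian distribution, and rank is determined numerically with a fixed tolerance. All names are hypothetical.

```python
import numpy as np

def gradient(poly, x):
    """Gradient at x of a polynomial given as {exponent tuple: coefficient}."""
    g = np.zeros(len(x))
    for exps, coef in poly.items():
        for v, e in enumerate(exps):
            if e == 0:
                continue
            reduced = np.array(exps)
            reduced[v] -= 1                      # differentiate with respect to x_v
            g[v] += coef * e * np.prod(x ** reduced)
    return g

def generic_rank(polys, n_vars, seed=0, tol=1e-8):
    """Rank of the differentials of `polys`, evaluated at a random point."""
    x = np.random.default_rng(seed).standard_normal(n_vars)
    jac = np.array([gradient(p, x) for p in polys])
    return int(np.linalg.matrix_rank(jac, tol=tol))

# Polynomials in three variables x, y, z.
e1 = {(1, 0, 0): 1.0, (0, 1, 0): 1.0, (0, 0, 1): 1.0}    # x + y + z
e2 = {(1, 1, 0): 1.0, (0, 1, 1): 1.0, (1, 0, 1): 1.0}    # xy + yz + zx
e3 = {(1, 1, 1): 1.0}                                    # xyz
p2 = {(2, 0, 0): 1.0, (0, 2, 0): 1.0, (0, 0, 2): 1.0}    # x^2 + y^2 + z^2

rank_dependent = generic_rank([e1, e2, p2], 3)    # p2 = e1^2 - 2 e2
rank_independent = generic_rank([e1, e2, e3], 3)
```

A set of polynomials is treated as algebraically independent when the rank equals the number of polynomials; here the relation p2 = e1^2 - 2 e2 shows up as a rank deficiency of the evaluated differentials.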